This application is a Non-Provisional Application of commonly assigned and co-pending Indian Patent Application Serial Number 202211039342, filed Jul. 8, 2022, and co-pending Indian Patent Application Serial Number 202211007031, filed Feb. 10, 2022, the disclosures of which are hereby incorporated by reference in their entireties.
With respect to floor plan design of residential as well as non-residential facilities, tools, such as computer-aided design (CAD) tools, may be used to design a floor plan. Depending on the complexity of the floor plan design, various levels of expertise may be required for utilization of such tools. In an example of a floor plan design, an architect may obtain the requirements from a client in the form of room types, number of rooms, room sizes, plot boundary, connections between rooms, etc., sketch out rough floor plans and collect feedback from the client, refine the sketched plans, and design and generate the floor plan using CAD tools. The experience of the architect may become a significant factor in the quality of the floor plan design.
Features of the present disclosure are illustrated by way of example and not limited in the following figure(s), in which like numerals indicate like elements, in which:
For simplicity and illustrative purposes, the present disclosure is described by referring mainly to examples. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure. It will be readily apparent, however, that the present disclosure may be practiced without limitation to these specific details. In other instances, some methods and structures have not been described in detail so as not to unnecessarily obscure the present disclosure.
Throughout the present disclosure, the terms “a” and “an” are intended to denote at least one of a particular element. As used herein, the term “includes” means includes but not limited to, the term “including” means including but not limited to. The term “based on” means based at least in part on.
Stylization-based floor plan generation apparatuses, methods for stylization-based floor plan generation, and non-transitory computer readable media having stored thereon machine readable instructions to provide stylization-based floor plan generation are disclosed herein. The apparatuses, methods, and non-transitory computer readable media disclosed herein provide for generation of a floor plan intuitively with limited knowledge about the design and limited experience with utilization of complex designing tools. The apparatuses, methods, and non-transitory computer readable media disclosed herein provide for floor plan design exploration that is guided by multi-attribute constraints. Further, the apparatuses, methods, and non-transitory computer readable media disclosed herein provide for the transfer of a style of one floor plan to another. In this regard, the apparatuses, methods, and non-transitory computer readable media disclosed herein provide for interactive creation of floor plans by users and/or designers. Yet further, the apparatuses, methods, and non-transitory computer readable media disclosed herein may facilitate interactive floor plan design of a residential or non-residential facility.
For the apparatuses, methods, and non-transitory computer readable media disclosed herein, user inputs in the form of boundary, room types, and spatial relationships may be considered to generate the layout design satisfying these requirements. Based on qualitative and quantitative analysis of metrics such as floor plan layout generation accuracy, realism, and quality, floor plans generated by the apparatuses, methods, and non-transitory computer readable media disclosed herein may provide greater realism and improved quality compared to known techniques.
With respect to floor plan design, a floor plan design for a home or a non-residential building may be perpetually customizable in that a home of the future may understand occupants' needs for space, mood, and occasion, and may automatically change itself, with these changes being perpetual and highly personalized. Further, the floor plan design for a home or a non-residential building may be assistive and protective in that a future home may make necessary accommodations based on specific physical limitations of occupants. The floor plan design for a home or a non-residential building may include a workflow that includes a first step including design ideas where inspiration is obtained from disparate sources, a second step including lifestyle analysis where current home and lifestyle aspects are examined, a third step including sketch design where a rough floor plan is sketched, and a fourth step including computer aided design (CAD) design where CAD tools are used to design the floor plan. Further, with respect to floor plan design, as disclosed herein, tools, such as CAD tools, may be used to design a floor plan. Depending on the complexity of the floor plan design, various levels of expertise may be required for utilization of such tools. In this regard, it is technically challenging to generate a floor plan without expertise in floor plan design or the use of complex designing tools.
In order to address at least the aforementioned technical challenges, the apparatuses, methods, and non-transitory computer readable media disclosed herein may implement a generative model to synthesize floor plans guided by user constraints. User inputs in the form of boundary, room types, and spatial relationships may be analyzed to generate the floor plan design that satisfies these requirements. For example, the apparatuses, methods, and non-transitory computer readable media disclosed herein may receive, as input, a layout graph describing objects (e.g., types of rooms) and their relationships (e.g., connections between rooms, placement of furniture), and generate one or more realistic floor plans corresponding to the layout graph. The apparatuses, methods, and non-transitory computer readable media disclosed herein may utilize a graph convolution network (GCN) to process an input layout graph, which provides embedding vectors for each room type. These vectors may be used to predict bounding boxes and segmentation masks for objects, which are combined to form a space layout. The space layout may be synthesized to an image using a cascaded alignment layer analyzer to generate a floor plan.
In one example, the architecture of the stylization-based floor plan generation apparatus may include a graph convolutional message passing network analyzer, a space layout network analyzer, and a cascaded alignment layer analyzer. The graph convolutional message passing network analyzer may process input graphs and generate embedding vectors for each room type. The space layout network analyzer may predict bounding boxes and segmentation masks for each room embedding, and combine the bounding boxes and the segmentation masks to generate a space layout. The cascaded alignment layer analyzer may synthesize the space layout to generate a floor plan using an input boundary feature map.
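For illustration only, the following minimal sketch (in PyTorch) shows how these three components may compose into a single forward pass. All class and parameter names are hypothetical and do not reflect the actual implementation of the apparatus; each sub-module is assumed to be a module implementing the stated role.

```python
# Illustrative composition of the three analyzers described above. All class
# and parameter names are hypothetical; each sub-module is assumed to be an
# nn.Module implementing the stated role.
import torch
import torch.nn as nn

class FloorPlanGenerator(nn.Module):
    def __init__(self, gcn, layout_net, alignment_net):
        super().__init__()
        self.gcn = gcn                      # graph convolutional message passing network
        self.layout_net = layout_net        # predicts boxes/masks, composes the space layout
        self.alignment_net = alignment_net  # cascaded alignment layers

    def forward(self, layout_graph, boundary_feature_map):
        room_embeddings = self.gcn(layout_graph)         # one vector per room type
        space_layout = self.layout_net(room_embeddings)  # boxes + masks -> layout
        floor_plan = self.alignment_net(space_layout,
                                        boundary_feature_map)  # synthesized image
        return floor_plan
```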
The apparatuses, methods, and non-transitory computer readable media disclosed herein may provide an end-to-end trainable network to generate floor plans along with doors and windows from a given input boundary and layout graph. The generated two-dimensional (2D) floor plan may be converted to 2.5-dimensional (2.5D) or three-dimensional (3D) floor plans. The aforementioned floor plan generation process may also be used to generate floor plans for a single unit or multiple units. For example, in the case of an apartment, a layout of multiple units of different configurations may be generated. The generated floor plan may be utilized to automatically (e.g., without human intervention) control (e.g., by a controller) one or more tools and/or machines related to construction of a structure specified by the floor plan. For example, the tools and/or machines may be automatically guided by the dimensional layout of the floor plan to coordinate and/or verify dimensions and/or configurations of structural features (e.g., walls, doors, windows, etc.) specified by the floor plan. In one example, the generated floor plan may be used to automatically generate 2.5D or 3D models.
The apparatuses, methods, and non-transitory computer readable media disclosed herein may further provide for the generation of high quality floor plan layouts without any post-processing. For example, compared to known techniques of floor plan generation, the apparatuses, methods, and non-transitory computer readable media disclosed herein may provide a floor plan that is more efficient and easier to build due to the higher quality of the floor plan. In this regard, the apparatuses, methods, and non-transitory computer readable media disclosed herein may provide an end-to-end trainable network to generate floor plans along with doors and windows from a given input boundary and layout graph. The apparatuses, methods, and non-transitory computer readable media disclosed herein may perform stylization of structural elements of a floor plan. For example, the apparatuses, methods, and non-transitory computer readable media disclosed herein may provide end-to-end parsing of a floor plan (e.g., CAD or raster format), identify similar floor plans to extract the style, and then apply the style elements to a new boundary to generate the floor plan. In some examples, user inputs (or requirements) in the form of a graph, such as a number of rooms, room type, and size, together with the input boundary, may be analyzed to generate a floor plan based on the user inputs.
For the apparatuses, methods, and non-transitory computer readable media disclosed herein, the elements of the apparatuses, methods, and non-transitory computer readable media disclosed herein may be any combination of hardware and programming to implement the functionalities of the respective elements. In some examples described herein, the combinations of hardware and programming may be implemented in a number of different ways. For example, the programming for the elements may be processor executable instructions stored on a non-transitory machine-readable storage medium and the hardware for the elements may include a processing resource to execute those instructions. In these examples, a computing device implementing such elements may include the machine-readable storage medium storing the instructions and the processing resource to execute the instructions, or the machine-readable storage medium may be separately stored and accessible by the computing device and the processing resource. In some examples, some elements may be implemented in circuitry.
Referring to
A space layout network analyzer 114 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
A cascaded alignment layer analyzer 124 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
A computer-aided design (CAD) floor plan parser 128 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
According to examples disclosed herein, the CAD floor plan parser 128 may parse the CAD floor plan 130 to determine the room layout for the CAD floor plan 130 by extracting, by an encoder 132, a plurality of features from the CAD floor plan 130. The CAD floor plan parser 128 may upsample, by a decoder 134, the extracted plurality of features to generate a segmentation image. Further, the CAD floor plan parser 128 may determine, by an attention component 136 and from the segmentation image, semantic information and target features to generate the room layout for the CAD floor plan 130.
According to examples disclosed herein, the attention component 136 may determine the semantic information and the target features by combining low-level feature maps with high-level feature maps.
According to examples disclosed herein, the attention component 136 may determine the semantic information and the target features by multiplying the low-level feature maps by an attention vector.
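A hedged sketch of such an attention-based fusion follows. The channel-attention vector derived from the high-level features, as well as the module structure and channel names, are assumptions for illustration; only the gating of low-level feature maps by an attention vector comes from the text.

```python
# Sketch of fusing low-level and high-level feature maps (names hypothetical):
# an attention vector derived from the high-level features gates the low-level
# features before the two maps are combined.
import torch
import torch.nn as nn

class FeatureFusionAttention(nn.Module):
    def __init__(self, low_ch, high_ch):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(high_ch, low_ch, kernel_size=1),
            nn.Sigmoid(),
        )
        self.proj = nn.Conv2d(low_ch + high_ch, high_ch, kernel_size=1)

    def forward(self, low, high):
        # Upsample high-level features to the low-level spatial size.
        high = nn.functional.interpolate(high, size=low.shape[2:],
                                         mode="bilinear", align_corners=False)
        attn = self.fc(self.pool(high))  # attention vector, shape (N, low_ch, 1, 1)
        low = low * attn                 # multiply low-level maps by the vector
        return self.proj(torch.cat([low, high], dim=1))
```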
A layout graph generator 138 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
A loss analyzer 140 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
A similar floor plan identifier 142 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
According to examples disclosed herein, the graph convolutional message passing network analyzer 102 may generate, based on the layout graph 106, the embedding vectors 112 for each room type of the plurality of room types 110 by utilizing a plurality of embedding layers to embed room types and relationships between rooms to generate vectors of a specified dimension.
According to examples disclosed herein, the space layout network analyzer 114 may determine, for each room embedding 116 from the layout graph 106, and based on the analysis of the embedding vectors 112 for each room type of the plurality of room types 110, the bounding boxes 118 and the segmentation masks 120 by passing the embedding vectors 112 to a box regression network to predict the bounding boxes 118.
According to examples disclosed herein, the space layout network analyzer 114 may generate, by combining the bounding boxes 118 and the segmentation masks 120, the space layout 122 by multiplying an embedding vector for each room type by an associated mask to generate a plurality of masked embedding shapes. The space layout network analyzer 114 may utilize bi-linear interpolation to modify the masked embedding shapes to a position of associated bounding boxes to generate room layouts. Further, the space layout network analyzer 114 may generate, based on a summation of the room layouts, the space layout 122.
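The combination of bounding boxes and masks into a space layout may be sketched as follows. The tensor shapes, box format, and helper name are assumptions for illustration; the masking, bi-linear warp to box position, and summation of room layouts come from the text.

```python
# Minimal sketch (shapes, box format, and helper name are assumptions) of
# composing a space layout: each room embedding is gated by its M×M mask and
# pasted into its bounding box on a D×H×W canvas via bilinear resizing; the
# per-room layouts are summed.
import torch
import torch.nn.functional as F

def compose_space_layout(embeddings, masks, boxes, out_size=64):
    """embeddings: (R, D); masks: (R, M, M) in [0, 1];
    boxes: (R, 4) as (x0, y0, x1, y1) normalized to [0, 1]."""
    R, D = embeddings.shape
    layout = embeddings.new_zeros(D, out_size, out_size)
    for r in range(R):
        # D*M*M masked embedding for room r.
        masked = embeddings[r].view(D, 1, 1) * masks[r].unsqueeze(0)
        b = (boxes[r] * out_size).round().long()
        x0 = int(b[0].clamp(0, out_size - 1))
        y0 = int(b[1].clamp(0, out_size - 1))
        x1 = int(b[2].clamp(x0 + 1, out_size))
        y1 = int(b[3].clamp(y0 + 1, out_size))
        # Bilinear warp of the masked embedding to the box position and size.
        room = F.interpolate(masked.unsqueeze(0), size=(y1 - y0, x1 - x0),
                             mode="bilinear", align_corners=False)[0]
        layout[:, y0:y1, x0:x1] += room  # space layout = sum of room layouts
    return layout
```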
A model generator 144 that is executed by at least one hardware processor (e.g., the hardware processor 2102 of
Referring to
With respect to the graph convolutional message passing network analyzer 102, the layout graph 106 may be passed through a series of graph convolution layers (e.g., a message passing network) which generates embedding vectors for each node (e.g., a room). The graph convolutional message passing network analyzer 102 may utilize embedding layers to embed the room types and relationships in the layout graph 106 to produce vectors, for example, of dimension D_in = 128. Given an input graph with vectors of dimension D_in at each node and edge, the graph convolutional message passing network analyzer 102 may determine new vectors of dimension D_out for each node and edge. Output vectors may be a function of a neighborhood of their corresponding inputs so that each graph convolution layer propagates information along edges of the layout graph 106.
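The embedding step may be sketched as follows. Only the dimension D_in = 128 comes from the text; the vocabulary sizes and example indices below are illustrative assumptions.

```python
# Sketch of the embedding step: learned embedding tables map discrete room
# types and edge relationships to 128-d vectors that seed the graph
# convolution layers. Vocabulary sizes and indices are assumed.
import torch
import torch.nn as nn

NUM_ROOM_TYPES = 12  # e.g., bedroom, kitchen, bathroom, ... (assumed)
NUM_RELATIONS = 4    # e.g., adjacent, door-connected, ... (assumed)
D_IN = 128

room_embedding = nn.Embedding(NUM_ROOM_TYPES, D_IN)
relation_embedding = nn.Embedding(NUM_RELATIONS, D_IN)

room_types = torch.tensor([0, 3, 5])        # three rooms in the layout graph
edge_types = torch.tensor([1, 1])           # two connections between rooms
node_vecs = room_embedding(room_types)      # (3, 128) node input vectors
edge_vecs = relation_embedding(edge_types)  # (2, 128) edge input vectors
```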
The space layout network analyzer 114 may predict the bounding boxes 118 and the segmentation masks 120 for each room embedding 116 from the layout graph 106, and combine the bounding boxes 118 and the segmentation masks 120 to generate the space layout 122. A bounding box may be used to describe the spatial location of an object. A mask may represent a binary image including zero and non-zero values. A space layout may represent an aggregation of bilinear interpolation of a bounding box and a mask for each room type (e.g., node).
The cascaded alignment layer analyzer 124 may synthesize the space layout 122 to generate the floor plan 104 using the input boundary feature map 126. The graph convolutional message passing network analyzer 102, the space layout network analyzer 114, and the cascaded alignment layer analyzer 124 may be trainable to generate, for example, rooms, walls, doors, and windows.
With respect to a scene graph as disclosed herein, the cascaded alignment layer analyzer 124 may receive the input boundary feature map 126 (e.g., B as a 256×256 image). Further, the graph convolutional message passing network analyzer 102 may receive the layout graph 106 with encoded user-constraints G as input. The cascaded alignment layer analyzer 124 may generate the floor plan 104 (e.g., floor plan layout L) as output. In some examples, the input boundary feature map 126 may be represented as a 256×256 image. The nodes of the layout graph 106 may denote room types, and the edges may denote connections between the rooms. Each node may be represented as a tuple (r_i, l_i, s_i); where r_i ∈ R^d
Referring to
Referring to
In further detail, the encoder (ResNeXt block) 132 may extract features from the floor plan image 402 and obtain a compact representation of these features through multiple levels. In this regard, a ResNeXt block may be utilized in the encoder 132 to extract features from the floor plan image 402. ResNeXt may repeat a building block that aggregates a set of transformations with the same topology. Down-sampling may be performed by a 2×2 max-pooling operation. During each downsampling, the image size may be reduced and the number of feature channels may be doubled.
With reference to
F′ = F ⊗ σ(f_conv([F^s_avg; F^s_max])) Equation (1)
In Equation (1), F may represent the input feature map 410, f_conv may represent the convolution operation with a filter size, for example, of 4×4, σ may represent the sigmoid activation function, and F^s_avg and F^s_max may represent the average-pooled and max-pooled spatial descriptors, respectively. The two spatial descriptors may be concatenated to generate a single feature descriptor. A convolution layer may be applied on the concatenated feature descriptor, followed by sigmoid activation, to generate a spatial attention map. Element-wise multiplication may be performed between the input feature map and the spatial attention map to generate a new feature map 412 focusing on spatial features. During the element-wise multiplication operation, the spatial attention values may be broadcast.
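A minimal sketch of this spatial attention follows. The 4×4 filter comes from the text; the asymmetric zero padding is an implementation choice so that an even-sized kernel preserves the spatial dimensions.

```python
# Sketch of the spatial attention of Equation (1): channel-wise average and
# max pooling give two spatial descriptors that are concatenated, convolved,
# and passed through a sigmoid to form the attention map.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.pad = nn.ZeroPad2d((1, 2, 1, 2))  # keeps H, W with a 4x4 kernel
        self.conv = nn.Conv2d(2, 1, kernel_size=4)
        self.sigmoid = nn.Sigmoid()

    def forward(self, f):
        avg_desc = f.mean(dim=1, keepdim=True)        # F^s_avg: (N, 1, H, W)
        max_desc = f.max(dim=1, keepdim=True).values  # F^s_max: (N, 1, H, W)
        descriptors = torch.cat([avg_desc, max_desc], dim=1)   # concatenation
        attn = self.sigmoid(self.conv(self.pad(descriptors)))  # attention map
        return f * attn  # attention values broadcast over channels
```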
The decoder 134 may be used to up-sample the extracted feature map from the encoder 132 to generate the segmentation image 404. Upsampling may be performed, for example, by bilinear interpolation. A 1×1 convolutional layer may be applied to predict a class of each pixel. The decoder 134 may be structurally symmetrical with the encoder 132. The copy operation may link the corresponding down-sampling and up-sampling feature maps. The decoder 134 may restore the details and spatial dimensions of an image according to the image features, and obtain the result of the image segmentation mask. The features obtained by the encoder 132 may include less semantic information and may be denoted low-level features, whereas the features obtained by the decoder 134 may be denoted high-level features.
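One decoder stage may be sketched as follows. The structure is assumed from the description above, and the class count of the final 1×1 classifier is illustrative.

```python
# Minimal sketch of one decoder stage, structurally symmetric with an encoder
# stage: bilinear upsampling, the "copy" link from the encoder, convolution,
# and a per-pixel classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderStage(nn.Module):
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + skip_ch, out_ch, 3, padding=1)

    def forward(self, x, skip):
        x = F.interpolate(x, scale_factor=2, mode="bilinear",
                          align_corners=False)  # up-sample by two
        x = torch.cat([x, skip], dim=1)         # copy link from the encoder
        return F.relu(self.conv(x))

pixel_classifier = nn.Conv2d(64, 20, kernel_size=1)  # 1x1 conv predicts a class per pixel
```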
Referring to
With respect to a loss function, a multi-task loss may be applied by the loss analyzer 140 as a training objective. The training objective may learn to predict semantic labels for pixels and regress the locations for interest points. The loss analyzer 140 may learn to determine (e.g., estimate) the pixel-accurate location for all points of interest by means of separate heatmap regression tasks that may be based on mean squared error (MSE). The loss analyzer 140 may also output two segmentation maps. The first segmentation map may be used for segmenting background, rooms, and walls. The second segmentation map may be used for segmenting different icons and openings (e.g., windows and doors). The two segmentation tasks may be trained using cross-entropy loss as follows:
For Equations (2)-(4), y_i may represent the label of the ith element in the floor plan, C may represent the number of floor plan elements, and p_i may represent the prediction probability of the pixels of the ith element. L_s may represent a cross-entropy loss for the segmentation part, and is composed of two cross-entropy terms for the room and icon segmentation tasks. Further, L_H may be utilized for training the heatmap regressors, and y_i and ŷ_i may represent the ground truth heatmap and predicted heatmap of location i. Equations (3) and (4) may be utilized for the loss function during model training.
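A hedged sketch of this multi-task objective follows. Tensor shapes are assumed, and equal weighting between the segmentation and heatmap terms is an assumption, as the text does not specify the weighting.

```python
# Sketch of the multi-task training objective described above: two
# cross-entropy terms (room and icon segmentation) plus an MSE
# heatmap-regression term. Shapes and equal term weights are assumed.
import torch
import torch.nn.functional as F

def multitask_loss(room_logits, room_labels, icon_logits, icon_labels,
                   pred_heatmaps, gt_heatmaps):
    """room_logits/icon_logits: (N, C, H, W); labels: (N, H, W) int64;
    heatmaps: (N, K, H, W) with one channel per interest-point type."""
    l_s = (F.cross_entropy(room_logits, room_labels)
           + F.cross_entropy(icon_logits, icon_labels))  # segmentation loss L_s
    l_h = F.mse_loss(pred_heatmaps, gt_heatmaps)         # heatmap loss L_H
    return l_s + l_h
```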
Operation of the apparatus 100 may be evaluated by utilizing a large-scale floor plan dataset such as Cubicasa5K that includes, for example, 5000 samples annotated into over 80 floor plan object categories. For example, the dataset may include 5000 floor plans (e.g., with user-specified annotations) that are collected and reviewed from a larger set of 15,000 floor plan images. The dataset may be divided into three categories that include high quality architectural, high quality, and colorful floor plans including 3732, 992 and 276 floor plans respectively. The dataset may be divided into training, validation and test sets including 4200, 400, and 400 floor plans respectively. The annotations may be in scalable vector graphics (SVG) format, and include the semantic and geometric annotations for all of the floor plan elements.
Referring to
Referring to
Referring to
Referring to
Referring to
Referring to
For Equation (5), d(G1, G2) may represent the distance between the embedding values of graphs G1 and G2, and d(G1, G3) may represent the distance between the embedding values of graphs G1 and G3. These distance functions may be implemented as Euclidean or cosine distances. Further, γ may represent a specified threshold value. The similar floor plan identifier 142 may map the node and edge features to initial node and edge vectors through a Multi-layer Perceptron (MLP) as follows:
After ‘t’ iterations, a node's representation may capture dependence from all the nodes within the t-hop neighborhood. Formally, a node v's representation at the tth layer may be defined as follows:
For Equations (6)-(8), h_v^(t) may represent the feature representation of node v at the tth layer, m_u^(t) may represent the transformed message from neighborhood node u, N(v) may represent the set of nodes adjacent to v, AGG may represent the aggregation function (implemented as a mean of neighborhood messages), V may represent the set of vertices, E may represent the set of edges, and MSG may represent a message transformation function.
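One message-passing layer consistent with these definitions may be sketched as follows. The linear MSG and update networks are assumptions; AGG is implemented as the stated mean over neighborhood messages.

```python
# Sketch of the message-passing update: MSG transforms a neighbor's state,
# AGG averages the incoming messages over N(v), and an update network
# produces the node's next-layer representation h_v^(t).
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)         # MSG (assumed linear)
        self.update = nn.Linear(2 * dim, dim)  # combines h_v with aggregated messages

    def forward(self, h, adjacency):
        """h: (V, dim) node states; adjacency: (V, V) 0/1 matrix."""
        messages = self.msg(h)                           # m_u for every node u
        deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1)
        agg = adjacency @ messages / deg                 # AGG: mean over N(v)
        return torch.relu(self.update(torch.cat([h, agg], dim=1)))
```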
Next, the similar floor plan identifier 142 may utilize cross graph message propagation to match the node in one graph to nodes of another graph. Message information may be cross propagated from one node to another node (e.g., cross graph) as follows:
m_i→j = f_m(h_i^(t), h_j^(t), e_ij) Equation (9)
The node similarity between two graphs may be measured, and the weights may be determined as follows:
s_i→j = f_s(h_i^(t), h_j^(t)) Equation (10)
For Equation (10), h_i^(t) may represent a hidden state of node i of one graph, and h_j^(t) may represent a hidden state of node j of another graph. A hidden state of a node may be updated based on other information as follows:
Further, a similarity function may be implemented as follows:
For Equation (12), s(h_i^(t), h_j^(t)) may represent the Euclidean or cosine similarity between two hidden states. Equations (9)-(12) may be utilized to determine the embedding value for each node.
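A hedged sketch of the cross-graph matching and similarity scoring of Equations (9)-(12) follows. The softmax attention over pairwise scores and the mean pooling to a whole-graph embedding are assumptions in the spirit of graph matching networks; f_m and f_s are reduced to dot products for brevity.

```python
# Sketch of cross-graph matching: each node of one graph attends to every
# node of the other, and cosine similarity of pooled states scores the match.
import torch
import torch.nn.functional as F

def cross_graph_messages(h1, h2):
    """h1: (V1, D) hidden states of one graph; h2: (V2, D) of the other."""
    scores = h1 @ h2.t()             # s_{i->j}: pairwise node similarities
    attn = F.softmax(scores, dim=1)  # weights over the other graph's nodes
    return attn @ h2                 # cross-graph message to each node i

def graph_similarity(h1, h2):
    # Pool node states to whole-graph embeddings, then score with cosine
    # similarity (the text permits Euclidean or cosine).
    g1, g2 = h1.mean(dim=0), h2.mean(dim=0)
    return F.cosine_similarity(g1, g2, dim=0)
```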
The results at 1300 determined by the similar floor plan identifier 142 show a score of 0.71, and the results at 1302 show a score of 0.34, with a higher score indicating greater similarity of a generated floor plan to an existing floor plan.
Referring to
Referring to
With respect to
The embedding vector of each room type v_i may be multiplied element-wise with its mask to generate a masked embedding of shape D*M*M at 1410, which may then be warped to the position of the bounding box using bi-linear interpolation to generate a room layout 1412. The space layout 1414 (e.g., also denoted scene layout) may represent the sum of all of the room layouts. A similar approach may be implemented to generate wall and door masks. During training, ground truth bounding boxes may be utilized for each room type to compare with the predicted bounding boxes. However, during inference, the predicted bounding boxes may be utilized.
With respect to image synthesizing, the cascaded alignment layer analyzer 124 may synthesize a rasterized space layout 122 from the layout graph 106 and the input boundary feature map 126. The input boundary may be passed through a series of convolution and pooling layers to obtain the input boundary feature map 126. The cascaded alignment layer analyzer 124 may implement a series of convolutional alignment modules (CAM). A CAM may receive as input the space layout 122 and the input boundary feature map 126, and generate a new feature map which is twice the spatial size of the input boundary feature map 126. Each CAM may upsample the input boundary feature map 126 by a factor of two, and downsample the space layout 122 using average pooling to match the size of the upsampled feature map. The upsampled input boundary feature map 126 and the downsampled space layout 122 may be passed through a region of interest alignment layer, and further processed with two convolution layers to generate the floor plan 104. Region of interest align, or RoIAlign, may be an operation for extracting a small feature map from each region of interest (e.g., in detection and segmentation-based tasks). RoIAlign may be implemented to properly align the extracted features with the input. RoIAlign may use bilinear interpolation to compute the exact values of the input features.
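One convolutional alignment module may be sketched as follows. The channel counts are illustrative, and the concatenation below stands in for the region of interest alignment step, which is not reproduced here.

```python
# Minimal sketch of one convolutional alignment module (CAM): upsample the
# boundary feature map by two, average-pool the space layout to match,
# combine, and refine with two convolutions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CAM(nn.Module):
    def __init__(self, feat_ch, layout_ch, out_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(feat_ch + layout_ch, out_ch, 3, padding=1)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)

    def forward(self, boundary_feat, space_layout):
        feat = F.interpolate(boundary_feat, scale_factor=2, mode="bilinear",
                             align_corners=False)  # upsample by a factor of two
        layout = F.adaptive_avg_pool2d(space_layout, feat.shape[2:])  # downsample
        x = torch.cat([feat, layout], dim=1)  # stand-in for the RoIAlign step
        return F.relu(self.conv2(F.relu(self.conv1(x))))
```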
With respect to the loss function, the space layout network analyzer 114 may be trained to minimize the weighted sum of four losses. For example, bounding box loss (L_b) may determine the L2 difference between ground truth and predicted bounding boxes. Mask loss (L_m) may determine the L2 difference between ground truth and predicted masks. Pixel loss (L_p) may determine the L2 difference between ground-truth and generated images. Overlap loss (L_o) may determine the overlap between the predicted room bounding boxes. The overlap between room bounding boxes may be specified to be as small as possible. Loss may be determined, for example, as: L_T = λ_b*L_b + λ_m*L_m + λ_p*L_p + λ_o*L_o, where λ_b = λ_m = λ_p = λ_o = 1.
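This weighted loss may be sketched directly, as follows. The box format is assumed, and the pairwise box-intersection penalty is one plausible reading of the overlap loss, not the apparatus's confirmed formulation.

```python
# Sketch of the weighted loss L_T with all weights set to 1, per the text.
import torch

def box_overlap(boxes):
    """boxes: (R, 4) as (x0, y0, x1, y1); sums pairwise intersection areas."""
    x0, y0, x1, y1 = boxes.unbind(dim=1)
    ix = (torch.min(x1[:, None], x1) - torch.max(x0[:, None], x0)).clamp(min=0)
    iy = (torch.min(y1[:, None], y1) - torch.max(y0[:, None], y0)).clamp(min=0)
    inter = ix * iy
    return (inter.sum() - inter.diag().sum()) / 2  # exclude self-overlap

def total_loss(pred_boxes, gt_boxes, pred_masks, gt_masks, pred_img, gt_img,
               lb=1.0, lm=1.0, lp=1.0, lo=1.0):
    l_b = torch.mean((pred_boxes - gt_boxes) ** 2)  # bounding box loss L_b
    l_m = torch.mean((pred_masks - gt_masks) ** 2)  # mask loss L_m
    l_p = torch.mean((pred_img - gt_img) ** 2)      # pixel loss L_p
    l_o = box_overlap(pred_boxes)                   # overlap loss L_o
    return lb * l_b + lm * l_m + lp * l_p + lo * l_o
```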
The training dataset may include, for example, several thousand vector-graphics floor plans of residential (and/or non-residential) buildings designed by architects. Each floor plan may be represented as a four channel image. The first channel may store inside mask, the second channel may store boundary mask, the third channel may store wall mask, and the fourth channel may store room mask.
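The four-channel representation may be illustrated as follows; the 256×256 resolution is an assumption carried over from the boundary image size mentioned earlier.

```python
# Tiny illustration of the four-channel sample format described above.
import torch

H = W = 256
inside = torch.zeros(H, W)    # first channel: inside mask
boundary = torch.zeros(H, W)  # second channel: boundary mask
walls = torch.zeros(H, W)     # third channel: wall mask
rooms = torch.zeros(H, W)     # fourth channel: room mask
sample = torch.stack([inside, boundary, walls, rooms])  # shape (4, 256, 256)
```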
Referring to
With respect to the graph convolutional message passing network analyzer 102, a graph neural network (GNN) of the graph convolutional message passing network analyzer 102 may represent a deep neural network that uses a graph data structure to capture the dependence of data. The GNN may adopt a message-passing strategy to update the representation of a node by aggregating transformed messages (representations) of its neighboring nodes. After ‘t’ iterations of message passing, a node's representation may capture dependence from all the nodes within the t-hop neighborhood. Formally, a node v's representation at the tth layer may be defined as follows:
In this regard, h_v^(t) may represent the feature representation of node v at the tth layer, m_u^(t) may represent the transformed message from neighborhood node u, and N(v) may represent the set of nodes adjacent to v. MSG may represent the message transformation at a particular node, and AGG may represent the aggregation function to capture the messages from neighboring nodes.
Referring to
The processor 2102 of
Referring to
The processor 2102 may fetch, decode, and execute the instructions 2108 to generate, based on the layout graph 106, embedding vectors 112 for each room type of the plurality of room types 110.
The processor 2102 may fetch, decode, and execute the instructions 2110 to determine, for each room embedding 116 from the layout graph 106, and based on an analysis of the embedding vectors 112 for each room type of the plurality of room types 110, bounding boxes 118 and segmentation masks 120.
The processor 2102 may fetch, decode, and execute the instructions 2112 to generate, by combining the bounding boxes 118 and the segmentation masks 120, a space layout 122.
The processor 2102 may fetch, decode, and execute the instructions 2114 to receive an input boundary feature map 126.
The processor 2102 may fetch, decode, and execute the instructions 2116 to generate, based on an analysis of the space layout 122 and the input boundary feature map 126, the floor plan 104.
Referring to
At block 2204, the method may include generating, based on the layout graph 106, a space layout 122.
At block 2206, the method may include receiving an input boundary feature map 126.
At block 2208, the method may include generating, based on an analysis of the space layout 122 and the input boundary feature map 126, the floor plan 104.
Referring to
The processor 2304 may fetch, decode, and execute the instructions 2308 to generate, based on the layout graph 106, a space layout 122.
The processor 2304 may fetch, decode, and execute the instructions 2310 to generate, based on an analysis of the space layout 122 and the input boundary feature map 126, the floor plan 104.
What has been described and illustrated herein is an example along with some of its variations. The terms, descriptions and figures used herein are set forth by way of illustration only and are not meant as limitations. Many variations are possible within the spirit and scope of the subject matter, which is intended to be defined by the following claims—and their equivalents—in which all terms are meant in their broadest reasonable sense unless otherwise indicated.
Number | Date | Country | Kind |
---|---|---|---|
202211007031 | Feb 2022 | IN | national |
202211039342 | Jul 2022 | IN | national |