This disclosure relates generally to inverse design of physical devices, and in particular but not exclusively, relates to inverse design of photonic devices.
Fiber-optic communication is typically employed to transmit information from one place to another via light that has been modulated to carry the information. For example, many telecommunication companies use optical fiber to transmit telephone signals, internet communication, and cable television signals. But the cost of deploying optical fibers for fiber-optic communication may be prohibitive. As such, techniques have been developed to more efficiently use the bandwidth available within a single optical fiber. Wavelength-division multiplexing is one such technique that bundles multiple optical carrier signals onto a single optical fiber using different wavelengths.
In some embodiments, a non-transitory computer-readable medium having logic stored thereon is provided. The logic, in response to execution by one or more processors of a computing system, causes the computing system to perform actions for designing a physical device. The actions include the computing system generating an initial design based on a design specification, wherein the initial design includes a list of features. The actions also include, for each feature of the list of features, determining whether the feature is present in a set of structural parameters by: in response to determining a feature presence function indicates that the feature should be included in the set of structural parameters, updating the set of structural parameters to include the feature; and in response to determining the feature presence function indicates that the feature should not be included in the set of structural parameters, refraining from updating the set of structural parameters to include the feature. The actions also include simulating, by the computing system, performance of the initial design using the set of structural parameters to determine a performance loss value; determining, by the computing system, a structural gradient based on the performance loss value; determining, by the computing system, a feature gradient based on the performance loss value; and updating, by the computing system, the features in the list of features based on the structural gradient and the feature gradient.
In some embodiments, a computer-implemented method for designing a physical device is provided. A computing system generates an initial design based on a design specification. The initial design includes a list of features. For each feature of the list of features, the computing system determines whether the feature is present in a set of structural parameters by, in response to determining a feature presence function indicates that the feature should be included in the set of structural parameters, updating the set of structural parameters to include the feature; and in response to determining the feature presence function indicates that the feature should not be included in the set of structural parameters, refraining from updating the set of structural parameters to include the feature. The computing system simulates performance of the initial design using the set of structural parameters to determine a performance loss value. The computing system determines a structural gradient based on the performance loss value. The computing system determines a feature gradient based on the performance loss value. The computing system updates the features in the list of features based on the structural gradient and the feature gradient.
Non-limiting and non-exhaustive embodiments of the invention are described with reference to the following figures, wherein like reference numerals refer to like parts throughout the various views unless otherwise specified. Not all instances of an element are necessarily labeled so as not to clutter the drawings where appropriate. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles being described. To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the figure number in which that element is first introduced.
Embodiments of techniques for inverse design of physical devices are described herein, in the context of generating designs for photonic integrated circuits (including a multi-channel photonic demultiplexer or multiplexer). In the following description numerous specific details are set forth to provide a thorough understanding of the embodiments. One skilled in the relevant art will recognize, however, that the techniques described herein can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring certain aspects.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Wavelength division multiplexing and its variants (e.g., dense wavelength division multiplexing, coarse wavelength division multiplexing, and the like) take advantage of the bandwidth of optical fibers by bundling multiple optical carrier signals onto a single optical fiber. Once the multiple carrier signals are bundled together, they are transmitted from one place to another over the single optical fiber where they may be demultiplexed to be read out by an optical communication device. However, devices that decouple the carrier signals from one another remain prohibitive in terms of cost, size, and the like.
Moreover, photonic devices, such as those used for optical communication, are traditionally designed via conventional techniques, sometimes through a simple guess-and-check method or a manually guided grid search in which a small number of design parameters from pre-determined designs or building blocks are adjusted for suitability to a particular application. However, in actuality, these devices may have design parameters ranging from hundreds all the way to many billions or more, dependent on the device size and functionality. Thus, as functionality of photonic devices increases and manufacturing tolerances improve to allow for smaller device feature sizes, it becomes increasingly important to take full advantage of these improvements via optimized device design.
Described herein are embodiments of a photonic integrated circuit (e.g., a multi-channel photonic demultiplexer and/or multiplexer) having a design obtainable by an inverse design process, and techniques for the design thereof. More specifically, techniques described in embodiments herein utilize gradient-based optimization in combination with first-principle simulations to generate a design from an understanding of the underlying physics that are expected to govern the operation of the photonic integrated circuit. It is appreciated that in other embodiments, design optimization of photonic integrated circuits without gradient-based techniques may also be used. Advantageously, embodiments and techniques described herein are not limited to conventional techniques used for design of photonic devices, in which a small number of design parameters for pre-determined building blocks are adjusted based on suitability to a particular application. Rather, the first-principles based designs described herein are not necessarily dependent on human intuition and generally may result in designs which outstrip current state-of-the-art designs in performance, size, robustness, or a combination thereof. Further still, rather than being limited to a small number of design parameters due to conventional techniques, the embodiments and techniques described herein may provide scalable optimization of a nearly unlimited number of design parameters. It will also be appreciated that, though the design and fabrication of photonic integrated circuits is described throughout the present text, similar inverse design techniques may be used to generate designs for other types of physical devices.
In the illustrated embodiment, optical communication device 102 includes a controller 104, one or more interface device(s) 112 (e.g., fiber optic couplers, light guides, waveguides, and the like), a multiplexer (mux), demultiplexer (demux), or combination thereof (MUX/DEMUX 114), one or more light source(s) 116 (e.g., light emitting diodes, lasers, and the like), and one or more light sensor(s) 118 (e.g., photodiodes, phototransistors, photoresistors, and the like) coupled to one another. The controller includes one or more processor(s) 106 (e.g., one or more central processing units, application specific integrated circuits, field programmable gate arrays, or otherwise) and memory 108 (e.g., volatile memory such as DRAM and SRAM, non-volatile memory such as ROM, flash memory, and the like). It is appreciated that optical communication device 120 may include the same or similar elements as optical communication device 102, which have been omitted for clarity.
Controller 104 orchestrates operation of optical communication device 102 for transmitting and/or receiving optical signal 110 (e.g., a multi-channel optical signal having a plurality of distinct wavelength channels or otherwise). Controller 104 includes software (e.g., instructions included in memory 108 coupled to processor 106) and/or hardware logic (e.g., application specific integrated circuits, field-programmable gate arrays, and the like) that when executed by controller 104 causes controller 104 and/or optical communication device 102 to perform operations.
In one embodiment, controller 104 may choreograph operations of optical communication device 102 to cause light source(s) 116 to generate a plurality of distinct wavelength channels that are multiplexed via MUX/DEMUX 114 into a multi-channel optical signal 110 that is subsequently transmitted to optical communication device 120 via interface device 112. In other words, light source(s) 116 may output light having different wavelengths (e.g., 1271 nm, 1291 nm, 1311 nm, 1331 nm, 1506 nm, 1514 nm, 1551 nm, 1571 nm, or otherwise) that may be modulated or pulsed via controller 104 to generate a plurality of distinct wavelength channels representative of information. The plurality of distinct wavelength channels are subsequently combined or otherwise multiplexed via MUX/DEMUX 114 into a multi-channel optical signal 110 that is transmitted to optical communication device 120 via interface device 112. In the same or another embodiment, controller 104 may choreograph operations of optical communication device 102 to cause a plurality of distinct wavelength channels to be demultiplexed via MUX/DEMUX 114 from a multi-channel optical signal 110 that is received via interface device 112 from optical communication device 120.
It is appreciated that in some embodiments certain elements of optical communication device 102 and/or optical communication device 120 may have been omitted to avoid obscuring certain aspects of the disclosure. For example, optical communication device 102 and optical communication device 120 may include amplification circuitry, lenses, or components to facilitate transmitting and receiving optical signal 110. It is further appreciated that in some embodiments optical communication device 102 and/or optical communication device 120 may not necessarily include all elements illustrated in
As illustrated in
In the illustrated embodiment of
As illustrated in
In the illustrated embodiment each of the plurality of output regions 304 is parallel to each other one of the plurality of output regions 304. However, in other embodiments the plurality of output regions 304 may not be parallel to one another or even disposed on the same side (e.g., one or more of the plurality of output regions 304 and/or input region 302 may be disposed proximate to sides of dispersive region 332 that are adjacent to first side 328 and/or second side 330). In some embodiments adjacent ones of the plurality of output regions are separated from each other by a common separation distance when the plurality of output regions includes at least three output regions. For example, as illustrated adjacent output region 308 and output region 310 are separated from one another by distance 306, which may be common to the separation distance between other pairs of adjacent output regions.
As illustrated in the embodiment of
It is noted that the first material and second material of dispersive region 332 are arranged and shaped within the dispersive region such that the material interface pattern is substantially proportional to a design obtainable with an inverse design process, which will be discussed in greater detail later in the present disclosure. More specifically, in some embodiments, the inverse design process may include iterative gradient-based optimization of a design based at least in part on a loss function that incorporates a performance loss (e.g., to enforce functionality) and a fabrication loss (e.g., to enforce fabricability and binarization of a first material and a second material) that is reduced or otherwise adjusted via iterative gradient-based optimization to generate the design. In the same or other embodiments, other optimization techniques may be used instead of, or jointly with, gradient-based optimization. Advantageously, this allows for optimization of a near unlimited number of design parameters to achieve functionality and performance within a predetermined area that may not have been possible with conventional design techniques.
For example, in one embodiment dispersive region 332 is structured to optically separate each of the four channels from the multi-channel optical signal within a predetermined area of 35 μm×35 μm (e.g., as defined by width 324 and length 326 of dispersive region 332) when the input region 302 receives the multi-channel optical signal. In the same or another embodiment, the dispersive region is structured to accommodate a common bandwidth for each of the four channels, each of the four channels having different center wavelengths. In one embodiment the common bandwidth is approximately 13 nm wide and the different center wavelengths are selected from a group consisting of 1271 nm, 1291 nm, 1311 nm, 1331 nm, 1506 nm, 1514 nm, 1551 nm, and 1571 nm. In some embodiments, the entire structure of demultiplexer 316 (e.g., including input region 302, periphery region 318, dispersive region 332, and plurality of output regions 304) fits within a predetermined area (e.g., as defined by width 320 and length 322). In one embodiment the predetermined area is 35 μm×35 μm. It is appreciated that in other embodiments dispersive region 332 and/or demultiplexer 316 fits within other areas greater than or less than 35 μm×35 μm, which may result in changes to the structure of dispersive region 332 (e.g., the arrangement and shape of the first and second material) and/or other components of demultiplexer 316.
In the same or other embodiments the dispersive region is structured to have a power transmission of −2 dB or greater from the input region 302, through the dispersive region 332, and to the corresponding one of the plurality of output regions 304 for a given wavelength within one of the plurality of distinct wavelength channels. For example, if channel 1 of a multi-channel optical signal is mapped to output region 308, then when demultiplexer 316 receives the multi-channel optical signal at input region 302 the dispersive region 332 will optically separate channel 1 from the multi-channel optical signal and guide a portion of the multi-channel optical signal corresponding to channel 1 to output region 308 with a power transmission of −2 dB or greater. In the same or another embodiment, dispersive region 332 is structured such that an adverse power transmission (i.e., isolation) for the given wavelength from the input region to any of the plurality of output regions other than the corresponding one of the plurality of output regions is −30 dB or less, −22 dB or less, or otherwise. For example, if channel 1 of a multi-channel optical signal is mapped to output region 308, then the adverse power transmission from input region 302 to any other one of the plurality of output regions (e.g., output region 310, output region 312, output region 314) other than the corresponding one of the plurality of output regions (e.g., output region 308) is −30 dB or less, −22 dB or less, or otherwise. In some embodiments, a maximum power reflection of an input signal (e.g., a multi-channel optical signal) that is received at an input region (e.g., input region 302) and reflected back to the input region by dispersive region 332 or otherwise is −40 dB or less, −20 dB or less, −8 dB or less, or otherwise.
It is appreciated that in other embodiments the power transmission, adverse power transmission, maximum power reflection, or other performance characteristics may be different than the respective values discussed herein, but the structure of dispersive region 332 may change accordingly due to the intrinsic relationship between structure, functionality, and performance of demultiplexer 316.
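As a non-limiting illustration of the decibel figures discussed above, the following sketch converts a power ratio in decibels to a linear fraction of input power and back. The function names are hypothetical and are not part of the disclosure; only the standard relationship between decibels and power ratios is assumed.

```python
import math


def db_to_fraction(db: float) -> float:
    """Convert a power ratio expressed in decibels to a linear fraction."""
    return 10.0 ** (db / 10.0)


def fraction_to_db(fraction: float) -> float:
    """Convert a linear power fraction to decibels."""
    return 10.0 * math.log10(fraction)


# Example values taken from the thresholds discussed in the text.
transmission = db_to_fraction(-2.0)   # roughly 63% of the input power
isolation = db_to_fraction(-30.0)     # roughly 0.1% of the input power
reflection = db_to_fraction(-40.0)    # roughly 0.01% of the input power
```

Under this conversion, a power transmission of −2 dB or greater corresponds to roughly 63% or more of the input power reaching the mapped output region, while an isolation of −30 dB or less corresponds to roughly 0.1% or less reaching any other output region.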
In one embodiment a silicon on insulator (SOI) wafer may be initially provided that includes a support substrate (e.g., a silicon substrate) that corresponds to substrate 334, a silicon dioxide dielectric layer that corresponds to dielectric layer 336, a silicon layer (e.g., intrinsic, doped, or otherwise), and an oxide layer (e.g., intrinsic, grown, or otherwise). In one embodiment, the silicon in the active layer 338 may be etched selectively by lithographically creating a pattern on the SOI wafer that is transferred to the SOI wafer via a dry etch process (e.g., via a photoresist mask or other hard mask) to remove portions of the silicon. The silicon may be etched all the way down to dielectric layer 336 to form voids that may subsequently be backfilled with silicon dioxide that is subsequently encapsulated with silicon dioxide to form cladding layer 340. In one embodiment, there may be several etch depths including a full etch depth of the silicon to obtain the targeted structure. In one embodiment, the silicon may be 206 nm thick and thus the full etch depth may be 206 nm. In some embodiments, this may be a two-step encapsulation process in which two silicon dioxide depositions are performed with an intermediate chemical mechanical planarization used to yield a planar surface.
It is appreciated that in the illustrated embodiments of demultiplexer 316 as shown in
As illustrated in
The first material 410 (i.e., black colored regions within dispersive region 406) and second material 412 (i.e., white colored regions within dispersive region 406) of photonic demultiplexer 400 are inhomogeneously interspersed to create a plurality of interfaces that collectively form material interface pattern 420 as illustrated in
As illustrated in
In some embodiments, material interface pattern 420 includes one or more dendritic shapes, wherein each of the one or more dendritic shapes is defined as a branched structure formed from first material 410 or second material 412 and having a width that alternates between increasing and decreasing in size along a corresponding direction. Referring back to
In some embodiments, the inverse design process includes a fabrication loss that enforces a minimum feature size, for example, to ensure fabricability of the design. In the illustrated embodiment of photonic demultiplexer 400 illustrated in
As illustrated, computing system 500 includes controller 512, display 502, input device(s) 504, communication device(s) 506, network 508, remote resources 510, bus 534, and bus 520. Controller 512 includes processor 514, memory 516, local storage 518, and photonic device simulator 522. Photonic device simulator 522 includes operational simulation engine 526, fabrication loss calculation logic 528, calculation logic 524, adjoint simulation engine 530, and optimization engine 532. It is appreciated that in some embodiments, controller 512 may be a distributed system.
Controller 512 is coupled to display 502 (e.g., a light emitting diode display, a liquid crystal display, and the like) coupled to bus 534 through bus 520 for displaying information to a user utilizing computing system 500 to optimize structural parameters of the photonic device (i.e., demultiplexer). Input device 504 is coupled to bus 534 through bus 520 for communicating information and command selections to processor 514. Input device 504 may include a mouse, trackball, keyboard, stylus, or other computer peripheral, to facilitate an interaction between the user and controller 512. In response, controller 512 may provide verification of the interaction through display 502.
Another device, which may optionally be coupled to controller 512, is a communication device 506 for accessing remote resources 510 of a distributed system via network 508. Communication device 506 may include any of a number of networking peripheral devices such as those used for coupling to an Ethernet, Internet, or wide area network, and the like. Communication device 506 may further include a mechanism that provides connectivity between controller 512 and the outside world. Note that any or all of the components of computing system 500 illustrated in
Controller 512 orchestrates operation of computing system 500 for optimizing structural parameters of the photonic device. Processor 514 (e.g., one or more central processing units, graphics processing units, and/or tensor processing units, etc.), memory 516 (e.g., volatile memory such as DRAM and SRAM, non-volatile memory such as ROM, flash memory, and the like), local storage 518 (e.g., magnetic memory such as computer disk drives), and the photonic device simulator 522 are coupled to each other through bus 520. Controller 512 includes software (e.g., instructions included in memory 516 coupled to processor 514) and/or hardware logic (e.g., application specific integrated circuits, field-programmable gate arrays, and the like) that when executed by controller 512 causes controller 512 or computing system 500 to perform operations. The operations may be based on instructions stored within any one of, or a combination of, memory 516, local storage 518, photonic device simulator 522, and remote resources 510 accessed through network 508.
In the illustrated embodiment, the components of photonic device simulator 522 are utilized to optimize structural parameters of the photonic device (e.g., MUX/DEMUX 114 of
As illustrated in
Each of the plurality of voxels 612 may be associated with a structural value, a field value, and a source value. Collectively, the structural values of the simulated environment 606 describe the structural parameters of the photonic device. In one embodiment, the structural values may correspond to a relative permittivity, permeability, and/or refractive index that collectively describe structural (i.e., material) boundaries or interfaces of the photonic device (e.g., material interface pattern 420 of
In the illustrated embodiment, the photonic device corresponds to an optical demultiplexer having a design region 614 (e.g., corresponding to dispersive region 332 of
However, in other embodiments, the entirety of the photonic device may be placed within the design region 614 such that the structural parameters may represent any portion or the entirety of the design of the photonic device. The electric and magnetic fields within the simulated environment 606 (and subsequently the photonic device) may change (e.g., represented by field values of the individual voxels that collectively correspond to the field response of the simulated environment) in response to the excitation source. The output ports 604 of the optical demultiplexer may be used for determining a performance metric of the photonic device in response to the excitation source (e.g., power transmission from input port 602 to a specific one of the output ports 604). The initial description of the photonic device, including initial structural parameters, excitation source, performance parameters or metrics, and other parameters describing the photonic device, are received by the system (e.g., computing system 500 of
Once the operational simulation reaches a steady state (e.g., changes to the field values in response to the excitation source substantially stabilize or reduce to negligible values) or otherwise concludes, one or more performance metrics may be determined. In one embodiment, the performance metric corresponds to the power transmission at a corresponding one of the output ports 604 mapped to the distinct wavelength channel being simulated by the excitation source. In other words, in some embodiments, the performance metric represents power (at one or more frequencies of interest) in the target mode shape at the specific locations of the output ports 604. A loss value or metric of the input design (e.g., the initial design and/or any refined design in which the structural parameters have been updated) based, at least in part, on the performance metric may be determined via a loss function. The loss metric, in conjunction with an adjoint simulation, may be utilized to determine a structural gradient (e.g., influence of structural parameters on the loss metric) for updating or otherwise revising the structural parameters to reduce the loss metric (i.e., increase the performance metric). It is noted that the loss metric may be further based on a fabrication loss value that is utilized to enforce a minimum feature size of the photonic device to promote fabricability of the device, and/or other loss values.
In some embodiments, iterative cycles of performing the operational simulation and the adjoint simulation, determining the structural gradient, and updating the structural parameters to reduce the loss metric are performed successively as part of an inverse design process that utilizes iterative gradient-based optimization. An optimization scheme such as gradient descent may be utilized to determine specific amounts or degrees of changes to the structural parameters of the photonic device to incrementally reduce the loss metric. More specifically, after each cycle the structural parameters are updated (e.g., optimized) to reduce the loss metric. The operational simulation, adjoint simulation, and updating the structural parameters are iteratively repeated until the loss metric substantially converges or is otherwise below or within a threshold value or range such that the photonic device provides the desired performance while maintaining fabricability.
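By way of a non-limiting sketch, the iterative cycle described above may be pictured as a plain gradient descent loop. The example below is an assumption-laden stand-in: a toy quadratic loss with an analytic gradient takes the place of the operational simulation and the adjoint simulation, and all names (`toy_loss`, `toy_gradient`, `optimize`, `step_size`, `tolerance`) are hypothetical and do not appear in the disclosure.

```python
def toy_loss(params, target):
    """Stand-in for the loss metric (NOT a physical simulation)."""
    return sum((p - t) ** 2 for p, t in zip(params, target))


def toy_gradient(params, target):
    """Stand-in for the structural gradient an adjoint simulation would give."""
    return [2.0 * (p - t) for p, t in zip(params, target)]


def optimize(params, target, step_size=0.1, tolerance=1e-8, max_iterations=1000):
    """Repeat evaluate -> gradient -> update until the loss falls below tolerance."""
    params = list(params)
    loss = toy_loss(params, target)
    iteration = 0
    for iteration in range(max_iterations):
        loss = toy_loss(params, target)          # "operational simulation"
        if loss < tolerance:                     # convergence check
            break
        grad = toy_gradient(params, target)      # "adjoint simulation"
        # Gradient descent update: step against the gradient.
        params = [p - step_size * g for p, g in zip(params, grad)]
    return params, loss, iteration


final_params, final_loss, iterations = optimize([0.0, 0.0, 0.0], [1.0, -2.0, 0.5])
```

In an actual inverse design process the loss and gradient would come from the simulations rather than from closed-form expressions, and the stopping rule may differ from the simple threshold assumed here.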
In previous techniques, the designs for a physical device may be parameterized directly using voxels 612 to represent both an initial design and structural parameters to be simulated within the simulated environment. While voxel-based parameterization can lead to non-intuitive and detailed designs such as those illustrated in
In some embodiments of the present disclosure, instead of using a voxel-based parameterization that requires evaluation of each voxel in the design, features included in the design for the physical device are parameterized using one or more geometric shape primitives, where each geometric shape primitive is large in comparison to the voxels of the structural parameters. By using significantly fewer parameterized features than the voxels of the structural parameters, the computing resources used to optimize the design are greatly reduced. Further, it has been found that the use of features parameterized using geometric shape primitives can cause the performance loss value to converge after fewer iterations of the optimization loop compared to the more detailed voxel-based parameterization.
In the illustrated embodiment, the features are represented by geometric shape primitives 704-726 that are circles. In other embodiments, features may be represented using other techniques, including but not limited to other types of geometric shape primitives such as rectangles or higher-order polygons. As will be seen, using circles (or other simple geometric shapes) as the geometric shape primitives 704-726 can lead to various efficiencies. For example, each of the geometric shape primitives 704-726 can be defined uniquely within the design region 702 with a small number of data points. The geometric shape primitive 726 is labeled with its defining data points: coordinates of a center of the geometric shape primitive 726 within a plane of the design region 702, and a radius of the geometric shape primitive 726. As can be seen, the entire geometric shape primitive 726 can be represented with three scalar values. This is a vast improvement over the voxel-based parameterization, in which each voxel within the geometric shape primitive 726 would be represented with its own value.
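The difference in parameter count can be illustrated with a minimal sketch, offered under stated assumptions rather than as the disclosed implementation. Each circle is stored as three scalars and is stamped onto a binary voxel grid only when structural parameters are needed. The `Circle` class, the `rasterize` function, the grid size, and the binary 0/1 voxel values are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Circle:
    x: float  # center coordinate along the grid columns
    y: float  # center coordinate along the grid rows
    r: float  # radius, in voxel units


def rasterize(circles: List[Circle], width: int, height: int) -> List[List[int]]:
    """Stamp each circle onto a binary voxel grid (1 = inside some circle)."""
    grid = [[0 for _ in range(width)] for _ in range(height)]
    for c in circles:
        for row in range(height):
            for col in range(width):
                # Test the voxel center, located at (col + 0.5, row + 0.5).
                if (col + 0.5 - c.x) ** 2 + (row + 0.5 - c.y) ** 2 <= c.r ** 2:
                    grid[row][col] = 1
    return grid


circles = [Circle(10.0, 10.0, 4.0), Circle(25.0, 20.0, 6.0)]
grid = rasterize(circles, width=40, height=30)

feature_parameter_count = 3 * len(circles)  # three scalars per circle
voxel_parameter_count = 40 * 30             # one value per voxel
```

In this toy case, two circles are described by six scalars, whereas the same region parameterized voxel by voxel would carry 1200 values.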
As illustrated in
In some embodiments, the initial design 834 includes a parameterization of the design. The parameters representing the design are optimized by the remainder of the operational simulation 802 and the adjoint simulation 804 in order to generate a design for the physical device that is highly performant. One non-limiting example parameterization is the voxel-based parameterization illustrated in
It is appreciated that the initial design 834 may be a relative term. Thus, in some embodiments an initial design 834 may be a first description of the physical device described within the context of the simulated environment (e.g., a first input design for performing a first operational simulation). However, in other embodiments, the term initial design 834 may refer to an initial design 834 of a particular iteration of an optimization loop (e.g., of performing an operational simulation 802, operating an adjoint simulation 804, and updating the structural parameters). In such an embodiment, the initial design 834 or design of that particular cycle may correspond to a revised description or refined design (e.g., generated from a previous cycle). In some embodiments, the simulated environment includes a design region that includes a portion of the plurality of voxels which have structural parameters that may be updated, revised, or otherwise changed to optimize the structural parameters of the physical device. In the same or other embodiments, the structural parameters are associated with geometric boundaries and/or material compositions of the physical device based on the material properties (e.g., relative permittivity, index of refraction, etc.) of the simulated environment.
In some embodiments, after determining the initial design 834, the operational simulation 802 generates a set of structural parameters 806 to be simulated within the simulated environment. Typically, the initial design 834 is provided as a list of features (e.g., a list of geometric shape primitives as discussed above), and the simulated environment uses a set of structural parameters such as the voxels 612 illustrated in
As stated above, the design specification may indicate a number of features to be included in the initial design 834. However, one difficult aspect of optimizing the initial design 834 is that the number of features indicated by the design specification may not be an optimal number of features to be included. For example, the number of features indicated by the design specification may be greater than an optimal number of features. As such, it is desirable to organize the optimization technique to account for changing the number of features included within the design.
Accordingly, in some embodiments, the operational simulation 802 uses a feature presence function 850 to determine whether each feature of the list of features from the initial design 834 should be included within the set of structural parameters 806 to be simulated. If the feature presence function 850 indicates that a given feature should be included within the design, then the structural parameters 806 are adjusted to include the given feature. Likewise, if the feature presence function 850 does not indicate that the given feature should be included within the design, then the given feature is ignored when updating the structural parameters 806. Further discussion of techniques for converting the initial design 834 to structural parameters 806 using a feature presence function 850 is illustrated and discussed with respect to
After the structural parameters 806 are determined, the operational simulation 802 proceeds to a simulation portion 848. The structural parameters 806 represent the physical structure of the physical device to be simulated, and may be represented by voxels 612 (or another format suitable for processing by the simulated environment) regardless of the specific parameterization provided by the initial design 834.
The simulation portion 848 occurs over a plurality of time-steps (e.g., from an initial time step to a final time step over a pre-determined or conditional number of time steps having a specified time step size) and models changes (e.g., from the initial field values 810) in electric and magnetic fields of a plurality of voxels describing the simulated environment and/or photonic device that collectively correspond to the field response. More specifically, update operations (e.g., update operation 812, update operation 814, and update operation 816) are iterative and based on the field response, structural parameters 806, and one or more excitation sources 808. Each update operation is succeeded by another update operation, which are representative of successive steps forward in time within the plurality of time steps. For example, update operation 814 updates the field values 838 (see, e.g.,
Once the final time step of the simulation portion 848 is performed, a performance loss function 818 is used to determine a performance loss value 820 associated with the initial design 834. In some embodiments, additional loss values, including but not limited to a fabrication loss value that is based on whether portions of the structural parameters 806 (and/or the initial design 834) are detected as violating one or more fabricability constraints, may be combined with the one or more performance loss values 820.
From the loss metric 822, loss gradients of the performance loss function may be determined at block 824. The loss gradients determined from block 824 may be treated as adjoint or virtual sources (e.g., physical stimuli or excitation source originating at an output region or port) which are backpropagated in reverse (from the final time step incrementally through the plurality of time steps until reaching the initial time step via update operation 826, update operation 830, and update operation 828) to determine a structural gradient 832. In some embodiments, a feature gradient 852 may also be determined based on the update operations. One aspect of a feature presence function 850 that either includes or excludes features is that this type of feature presence function 850 is poorly differentiable, if at all. In other words, there is a discontinuity in the feature presence function 850 around the region where the feature transitions from being included to being excluded. As such, gradients typically do not flow through the feature presence function 850 effectively in order to allow the number of features included in the design to be adjusted by the optimization process. Accordingly, a separate feature gradient 852 that is different from the feature presence function 850 may be used by the adjoint simulation 804 to update the values in the features that are used by the feature presence function 850. Further descriptions of the uses of the feature presence function 850 and the feature gradient 852 are provided below.
In the illustrated embodiment, the FDTD solve (e.g., simulation portion 848 of the operational simulation 802) and backward solve (e.g., adjoint simulation 804) problem are described pictorially, from a high-level, using only “update” and “loss” operations as well as their corresponding gradient operations. The simulation is set up initially in which the structural parameters, physical stimuli (i.e., excitation source), and initial field states of the simulated environment (and photonic device) are provided (e.g., via an initial description and/or input design). As discussed previously, the field values are updated in response to the excitation source based on the structural parameters. More specifically, the update operation is given by $\phi$, where $x_{i+1} = \phi(x_i, b_i, z)$ for $i = 1, \ldots, n$. Here, $n$ corresponds to the total number of time steps (e.g., the plurality of time steps) for the operational simulation, where $x_i$ corresponds to the field response (the field value associated with the electric and magnetic fields of each of the plurality of voxels) of the simulated environment at time step $i$, $b_i$ corresponds to the excitation source(s) (the source value associated with the electric and magnetic fields for each of the plurality of voxels) of the simulated environment at time step $i$, and $z$ corresponds to the structural parameters describing the topology and/or material properties of the physical device (e.g., relative permittivity, index of refraction, and the like).
It is noted that using the FDTD method, the update operation may specifically be stated as:
$$\phi(x_i, b_i, z) = A(z)\,x_i + B(z)\,b_i. \tag{1}$$
That is to say the FDTD update is linear with respect to the field and source terms. Concretely, $A(z) \in \mathbb{R}^{N \times N}$ and $B(z) \in \mathbb{R}^{N \times N}$ are linear operators which depend on the structural parameters, $z$, and act on the fields, $x_i$, and the sources, $b_i$, respectively. Here, it is assumed that $x_i, b_i \in \mathbb{R}^{N}$ where $N$ is the number of FDTD field components in the operational simulation. Additionally, the loss operation (e.g., loss function) may be given by $L = f(x_1, \ldots, x_n)$, which takes as input the computed fields and produces a single, real-valued scalar (e.g., the loss metric) that can be reduced and/or minimized.
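As a non-limiting illustration, the linear update and a scalar loss may be sketched as follows. The 2x2 operators, the function names, and the choice of loss (energy of the final field state) are hypothetical and are chosen only to keep the example small; they do not represent an actual FDTD stencil.

```python
# Toy illustration of the linear update x_{i+1} = A(z) x_i + B(z) b_i.
# The operators below are hypothetical and are NOT a real FDTD stencil.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def make_operators(z):
    """Hypothetical operators that depend on one structural parameter z."""
    a = [[1.0, -z], [z, 1.0 - z * z]]
    b = [[z, 0.0], [0.0, z]]
    return a, b

def update(x, b_src, z):
    """One update operation: phi(x, b, z) = A(z) x + B(z) b."""
    a, b = make_operators(z)
    return [p + q for p, q in zip(matvec(a, x), matvec(b, b_src))]

def simulate(x0, sources, z):
    """Apply one update per time step and return every field state."""
    states = [x0]
    for b_src in sources:
        states.append(update(states[-1], b_src, z))
    return states

def loss(states):
    """Example scalar loss (an assumption): energy of the final field state."""
    return sum(v * v for v in states[-1])

# A single source pulse at the first time step, followed by two free steps.
states = simulate([0.0, 0.0], [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]], z=0.5)
print(loss(states))
```

Because the update is linear in the field and source terms, scaling both inputs by a constant scales the updated field by the same constant.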
In terms of revising or otherwise optimizing the structural parameters of the physical device, the relevant quantity to produce is $dL/dz$, which is used to describe the influence of changes in the structural parameters of the initial design 834 on the loss value and is denoted as the structural gradient 832 illustrated in
, which include $\partial L/\partial x_i$, $\partial x_{i+1}/\partial x_i$, $dL/dx_i$, and $\partial x_i/\partial z$.
The update operation 814 of the operational simulation 802 updates the field values 838, $x_i$, of the plurality of voxels at the $i$-th time step to the next time step (i.e., the $(i+1)$-th time step), which correspond to the field values 840, $x_{i+1}$. The gradients 842 are utilized to determine $\partial L/\partial x_i$ for the backpropagation (e.g., update operation 830 backwards in time), which combined with the gradients 844 are used, at least in part, to calculate the structural gradient, $dL/dz$. The term $\partial L/\partial x_i$ is the contribution of each field to the loss metric, $L$. It is noted that this is the partial derivative, and therefore does not take into account the causal relationship of $x_i \to x_{i+1}$. Thus, $\partial x_{i+1}/\partial x_i$ is utilized, which encompasses the $x_i \to x_{i+1}$ relationship. The loss gradient, $dL/dx_i$, may also be used to compute the structural gradient, $dL/dz$, and corresponds to the total derivative of the loss value, $L$, with respect to the field. The loss gradient, $dL/dx_i$, at a particular time step, $i$, is equal to the summation of $\frac{\partial L}{\partial x_i} + \frac{dL}{dx_{i+1}}\frac{\partial x_{i+1}}{\partial x_i}$. Finally, $\partial x_i/\partial z$, which corresponds to the field gradient, is used, which is the contribution to $dL/dz$ from each time/update step.
In particular, the memory footprint to directly compute $\partial L/\partial x_i$ and $dL/dz$ is so large that it is difficult to store more than a handful of state Tensors. The state Tensor corresponds to storing the values of all of the FDTD cells (e.g., the plurality of voxels) for a single simulation time step. It is appreciated that the term “tensor” may refer to tensors in a mathematical sense or as described by the TensorFlow framework developed by Alphabet, Inc. In some embodiments the term “tensor” refers to a mathematical tensor which corresponds to a multidimensional array that follows specific transformation laws. However, in most embodiments, the term “tensor” refers to TensorFlow tensors, in which a tensor is described as a generalization of vectors and matrices to potentially higher dimensions (e.g., n-dimensional arrays of base data types), and is not necessarily limited to specific transformation laws. For example, for the general loss function $f$, it may be necessary to store the fields, $x_i$, for all time steps, $i$. This is because, for most choices of $f$, the gradient will be a function of the arguments of $f$. This difficulty is compounded by the fact that the values of $\partial L/\partial x_i$ for larger values of $i$ are needed before the values for smaller $i$ due to the incremental updates of the field response and/or through backpropagation of the loss metric, which may prevent the use of schemes that attempt to store only the values $\partial L/\partial x_i$, at an immediate time step.
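To give a sense of scale for the storage requirement, the following sketch estimates the memory needed to keep one state Tensor and to keep a state Tensor for every time step. The grid size, the number of field components per voxel, the number of time steps, and the 4-byte value size are all hypothetical figures chosen for illustration and are not taken from this disclosure.

```python
def state_bytes(num_voxels, components_per_voxel, bytes_per_value=4):
    """Bytes needed to store the field values of every voxel at one time step."""
    return num_voxels * components_per_voxel * bytes_per_value

def all_steps_bytes(num_voxels, components_per_voxel, num_time_steps,
                    bytes_per_value=4):
    """Bytes needed to store one state per time step for the whole simulation."""
    per_step = state_bytes(num_voxels, components_per_voxel, bytes_per_value)
    return per_step * num_time_steps

# Hypothetical example: a 200 x 200 x 200 voxel grid with six field
# components per voxel (three electric, three magnetic), 10,000 time steps.
voxels = 200 * 200 * 200
one_state = state_bytes(voxels, 6)
every_state = all_steps_bytes(voxels, 6, 10_000)
print(one_state / 1e6, "MB per state")
print(every_state / 1e12, "TB for all states")
```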
An additional difficulty is further illustrated when computing the structural gradient, $dL/dz$, which is given by:
$$\frac{dL}{dz} = \sum_{i} \frac{dL}{dx_i}\,\frac{\partial x_i}{\partial z}. \tag{2}$$
For completeness, the full form of the first term in the sum, $dL/dx_i$, is expressed as:
$$\frac{dL}{dx_i} = \frac{\partial L}{\partial x_i} + \frac{dL}{dx_{i+1}}\,\frac{\partial x_{i+1}}{\partial x_i}. \tag{3}$$
Based on the definition of $\phi$ as described by equation (1), it is noted that $\frac{\partial x_{i+1}}{\partial x_i} = A(z)$, which can be substituted in equation (3) to arrive at an adjoint update for backpropagation (e.g., the update operations such as update operation 830), which can be expressed as:
$$\frac{dL}{dx_i} = \frac{\partial L}{\partial x_i} + \frac{dL}{dx_{i+1}}\,A(z). \tag{4}$$
The adjoint update is the backpropagation of the loss gradient (e.g., from the loss metric) from later to earlier time steps and may be referred to as a backwards solve for $dL/dx_i$. More specifically, the loss gradient may initially be based upon the backpropagation of a loss metric determined from the operational simulation with the loss function. The second term in the sum of the structural gradient, $dL/dz$, corresponds to the field gradient and is denoted as:
$$\frac{\partial x_i}{\partial z} = \frac{d\phi(x_{i-1}, b_{i-1}, z)}{dz} = \frac{dA(z)}{dz}\,x_{i-1} + \frac{dB(z)}{dz}\,b_{i-1}, \tag{5}$$
for the particular form of $\phi$ described by the first equation above. Thus, each term of the sum associated with $dL/dz$ depends on both $dL/dx_{i_0}$ for $i_0 \ge i$ and $x_{i_0}$ for $i_0 < i$. Since the dependency chains of these two terms are in opposite directions, it is concluded that computing $dL/dz$ in this way requires the storage of $x_i$ values for all of $i$. In some embodiments, the need to store all field values may be mitigated by a reduced representation of the fields.
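The forward solve and backward (adjoint) solve described above may be sketched for a scalar toy problem as follows. The choices $A(z) = z$ and $B(z) = 1 - z$, the quadratic loss, and all function names are assumptions made only for illustration. The sketch stores every field value during the forward solve, then visits the time steps from last to first, which mirrors the opposite dependency directions noted above.

```python
def forward(x0, sources, z):
    """Run x_{k+1} = A(z) x_k + B(z) b_k with toy A(z) = z and B(z) = 1 - z."""
    xs = [x0]
    for b in sources:
        xs.append(z * xs[-1] + (1.0 - z) * b)
    return xs

def loss(xs):
    """Toy loss (an assumption): half the sum of squared updated fields."""
    return 0.5 * sum(x * x for x in xs[1:])

def structural_gradient(x0, sources, z):
    """Compute dL/dz by a forward solve followed by a backward solve."""
    xs = forward(x0, sources, z)      # every field value is stored
    d_a_dz, d_b_dz = 1.0, -1.0        # derivatives of the toy A(z) and B(z)
    total = 0.0
    dl_dx_next = 0.0                  # no time step follows the last one
    for k in range(len(sources), 0, -1):   # later time steps come first
        # Adjoint update: dL/dx_k = (partial L / partial x_k) + dL/dx_{k+1} A(z)
        dl_dx = xs[k] + dl_dx_next * z
        # Field gradient: partial x_k / partial z uses the earlier state x_{k-1}
        dx_dz = d_a_dz * xs[k - 1] + d_b_dz * sources[k - 1]
        total += dl_dx * dx_dz
        dl_dx_next = dl_dx
    return total

grad = structural_gradient(1.0, [0.5, -0.2, 0.3], z=0.7)
print(grad)
```

One way to check such a sketch is to compare the result against a central finite difference of the loss with respect to $z$.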
It is appreciated that method 900 is an inverse design process that may be accomplished by performing operations with a system to perform iterative gradient-based optimization of a loss metric determined from a loss function that includes at least a performance loss, similar to that illustrated and described in
From a start block, the method 900 proceeds to block 902, where a design specification of a physical device such as a photonic integrated circuit is received. In some embodiments, the physical device may be expected to have a certain functionality (e.g., perform as an optical demultiplexer, an optical multiplexer, an optical waveguide bend, or another type of optoelectronic component) after optimization. In some embodiments, the design specification may indicate an overall structure of the physical device (e.g., dimensions of a design region, initial locations and numbers of one or more input ports and/or one or more output ports), desired performance of the device (e.g., desired performance characteristics at each input port and/or output port), one or more fabricability constraints (e.g., a minimum feature size, a minimum distance, a boundary buffer size, etc.) associated with a fabrication system (e.g., a photolithography system or collection of systems that includes a photolithography system) to be used to fabricate the physical device, a starting number of features to be included in a list of features, and/or any other relevant specification.
At block 904, an initial design 834 is generated that includes a list of features based on the design specification. In some embodiments, the features of the list of features may be specified using geometric shape primitives. In some embodiments, the type of geometric shape primitive (e.g., circle, square, rectangle, higher-order polygon, etc.) may be indicated by the design specification. In some embodiments, a number of geometric shape primitives to be included in the initial design 834 may be indicated in the design specification. In some embodiments, the geometric shape primitives may be randomly sized and randomly positioned within the design region of the initial design 834. In some embodiments, the geometric shape primitives of the initial design 834 may be of a default size and/or positioned at default or regular positions within the design region of the initial design 834. In some embodiments, the geometric shape primitives of the initial design 834 may be arranged to comply with the fabricability constraints associated with the fabrication system. In some embodiments, the geometric shape primitives may be arranged regardless of the fabricability constraints, with fabricability to be achieved during the optimization process.
Once the initial design 834 is generated at block 904, the method 900 advances through a continuation terminal (“terminal B”) to a for-loop defined between a for-loop start block 906 and a for-loop end block 916. In the for-loop, each feature in the list of features is processed to either be added to the structural parameters or ignored based on the feature presence function 850.
To process a feature, the method 900 advances from the for-loop start block 906 to block 908, where a feature presence function 850 is evaluated for the feature to determine whether the feature should be included in the structural parameters 806. In some embodiments, an input to the feature presence function 850 is some aspect of the parameterization of the feature, such as a size of the feature. In some embodiments, an input to the feature presence function 850 may be a scalar value associated with the feature that represents the feature presence and that may be separately optimized from the size and/or shape of the feature itself.
where c is the location of the feature presence cutoff 1106.
One non-limiting example of the values illustrated on the X-axis is feature size. In other words, if the features are parameterized using geometric shape primitives such as circles, the values on the X-axis may be radius values. The feature presence function 1102 would consider the radius of the circle (the value on the X-axis) to determine whether the circle should be included in the structural parameters 806. A feature presence cutoff 1106 may be positioned at the minimum feature size supported by the fabrication system. Accordingly, if the optimization of the size of the feature causes the radius of the circle to be smaller than the minimum feature size, the feature would no longer be included in the structural parameters 806.
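The radius-based cutoff in this example may be sketched as follows. The dictionary layout for a circle, the function names, and the numeric cutoff are hypothetical; the sketch only shows the include-or-ignore decision and does not reproduce any particular form of the feature presence function.

```python
def feature_present(radius, cutoff):
    """Hard cutoff: a circle is kept only if its radius is at least the cutoff."""
    return radius >= cutoff

def select_features(features, cutoff):
    """Return the features that would be added to the structural parameters."""
    return [f for f in features if feature_present(f["radius"], cutoff)]

# Hypothetical list of circular features (arbitrary length units).
features = [
    {"xc": 1.0, "yc": 1.0, "radius": 0.50},
    {"xc": 3.0, "yc": 2.0, "radius": 0.10},   # below the cutoff, ignored
    {"xc": 5.0, "yc": 4.0, "radius": 0.30},
]
kept = select_features(features, cutoff=0.30)
print([f["radius"] for f in kept])
```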
One will note that while the feature presence function 850 is effective in providing a cutoff below which the feature will no longer be included in the structural parameters 806, the feature presence function 850 is poorly differentiable: the gradient at the feature presence cutoff 1106 is undefined, and the gradient at all values below the feature presence cutoff 1106 is zero. As such, the feature presence function 850 cannot itself be used for gradient-based optimization of the feature presence. A solution to this issue is illustrated in
After evaluating the feature presence function 850, the method 900 advances to a decision block 910, where a determination is made regarding whether the feature should be included in the structural parameters 806 based on the result of the feature presence function 850. In some embodiments, the result of the feature presence function 850 may be compared to a threshold value to determine whether the feature should be included. In the non-limiting example feature presence function 1102 illustrated in
Otherwise, if the determination at decision block 910 is that the feature should be included in the structural parameters 806, the method 900 advances to block 912. While using geometric shape primitives to parameterize the design region may greatly simplify the search space to be analyzed during the inverse design process, one problem arises in that the geometric shape primitives may themselves be poorly differentiable. In other words, while the number of parameters to be optimized for geometric shape primitives is much lower than if individual voxels within the design region are optimized, a gradient of the loss metric does not backpropagate well to the geometric shape primitives themselves due to their discrete (not continuous) nature, and so it may be desirable to use an intermediate, continuous representation to convert the geometric shape primitives to structural parameters to be simulated in order to improve differentiability. Accordingly, at block 912, a signed distance field is determined for each of the geometric shape primitives.
In some embodiments, the signed distance field is determined analytically. For each voxel x, y within the signed distance field for a circular geometric shape primitive with a center xc, yc and radius r, the value of the voxel is given as:
It should be noted that though the first geometric shape primitive 1004, second signed distance field 1006, and second geometric shape primitive 1008 are illustrated in
In some embodiments, the features may be specified using other than a geometric shape primitive. In such embodiments, the features may still be represented by a signed distance field, such that a border of the feature is represented by a zero-value contour, and areas inside and outside of the border are increasingly negative or positive as discussed above. An analytic determination of the values of the signed distance field may also be used, or any other suitable technique for determining or specifying distances from the zero-value contour may be used.
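For a circular geometric shape primitive, one common analytic signed distance function is the distance from the voxel to the center minus the radius. The sketch below assumes that form and assumes the sign convention in which values are negative inside the circle, zero on its border, and positive outside; the function names and the integer voxel grid are likewise assumptions for illustration.

```python
import math

def circle_sdf(x, y, xc, yc, r):
    """Signed distance from (x, y) to a circle border (assumed negative inside)."""
    return math.hypot(x - xc, y - yc) - r

def circle_sdf_grid(nx, ny, xc, yc, r):
    """Evaluate the signed distance at every voxel of an nx-by-ny grid."""
    return [[circle_sdf(x, y, xc, yc, r) for x in range(nx)] for y in range(ny)]

grid = circle_sdf_grid(8, 6, xc=3.0, yc=2.0, r=2.0)
print(grid[2][3])   # value at the circle center
```

Because the expression is smooth in `xc`, `yc`, and `r` everywhere except at the exact center, derivatives of the field with respect to the primitive's parameters are available in closed form.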
Returning to
Once the density field is generated for the feature, the method 900 advances to the for-loop end block 916. If additional features remain to be processed in the list of features, then the method 900 returns to for-loop start block 906 to process the next feature. Otherwise, the method 900 advances to a continuation terminal (“terminal A”).
From terminal A (
After combination of the density fields, a step of binarization may take place such that negative values may be set to a value of zero, and positive values may be set to a value of one, to indicate the presence or absence of a given material for the set of structural parameters 806. In some embodiments, some values within a threshold range of zero may be assigned a real value between zero and one to indicate a partial amount of the voxel to be filled with the given material. In some embodiments, instead of a hard threshold, the values of the density map may be passed through a sigmoid function to assign most values within the density map to zero or one, but to leave a differentiable transition region close to zero so that gradients can pass through the density map during optimization.
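The difference between a hard threshold and a sigmoid-based threshold may be sketched as follows. The `sharpness` parameter and the function names are assumptions, and the treatment of a value of exactly zero under the hard threshold is a choice made here, since the text does not specify it.

```python
import math

def hard_binarize(v):
    """Hard threshold: positive values map to one, all others to zero."""
    return 1.0 if v > 0 else 0.0

def soft_binarize(v, sharpness=10.0):
    """Sigmoid threshold, written to avoid overflow for large |v|."""
    t = sharpness * v
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def soft_binarize_slope(v, sharpness=10.0):
    """Derivative of soft_binarize with respect to v."""
    s = soft_binarize(v, sharpness)
    return sharpness * s * (1.0 - s)

for v in (-1.0, -0.05, 0.0, 0.05, 1.0):
    print(v, hard_binarize(v), soft_binarize(v), soft_binarize_slope(v))
```

Far from zero the sigmoid output is close to zero or one, while near zero its slope is non-zero, which is what allows gradients to pass through during optimization.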
Within the simulated environment 606, each of the plurality of voxels is associated with a structural value to describe the structural parameters, a field value to describe the field response (e.g., the electric and magnetic fields in one or more orthogonal directions) to physical stimuli (e.g., one or more excitation sources), and a source value to describe the physical stimuli.
At block 920, a simulated environment 606 is configured to be representative of the set of structural parameters 806. Once the structural parameters 806 are determined, the simulated environment 606 is configured (e.g., the number of voxels, shape/arrangement of voxels, and specific values for the structural value, field value, and/or source value of the voxels are set based on the structural parameters 806).
In some embodiments the simulated environment includes a design region optically coupled between a first communication region and a plurality of second communication regions. In some embodiments, the first communication region may correspond to an input region or port (e.g., where an excitation source originates), while the plurality of second communication regions may correspond to a plurality of output regions or ports (e.g., when designing an optical demultiplexer that optically separates a plurality of distinct wavelength channels included in a multi-channel optical signal received at the input port and respectively guides each of the distinct wavelength channels to a corresponding one of the plurality of output ports). However, in other embodiments, the first communication region may correspond to an output region or port, while the plurality of second communication regions corresponds to a plurality of input ports or regions (e.g., when designing an optical multiplexer that optically combines a plurality of distinct wavelength signals received at respective ones of the plurality of input ports to form a multi-channel optical signal that is guided to the output port).
At block 922, each of a plurality of distinct wavelength channels is mapped to a respective one of the plurality of second communication regions. The distinct wavelength channels may be mapped to the second communication regions by virtue of the design specification. For example, a loss function may be chosen that associates a performance metric of the physical device with power transmission from the input port to individual output ports for mapped channels. In one embodiment, a first channel included in the plurality of distinct wavelength channels is mapped to a first output port, meaning that the performance metric of the physical device for the first channel is tied to the first output port. Similarly, other output ports may be mapped to the same or different channels included in the plurality of distinct wavelength channels such that each of the distinct wavelength channels is mapped to a respective one of the plurality of output ports (i.e., second communication regions) within the simulated environment 606. In one embodiment, the plurality of second communication regions includes four regions and the plurality of distinct wavelength channels includes four channels that are each mapped to a corresponding one of the four regions. In other embodiments, there may be a different number of the second communication regions (e.g., 8 regions) and a different number of channels (e.g., 8 channels) that are each mapped to a respective one of the second communication regions. In some embodiments, only a single input port and a single output port may be included, such as for waveguide bends or other devices intended to change a direction of an incoming signal to another direction.
Block 924 illustrates performing an operational simulation 802 of the physical device within the simulated environment 606 operating in response to one or more excitation sources to determine a performance loss value 820. More specifically, in some embodiments an electromagnetic simulation is performed in which a field response of the photonic integrated circuit is updated incrementally over a plurality of time steps to determine how the field response of the physical device changes due to the excitation source. The field values of the plurality of voxels are updated in response to the excitation source and based, at least in part, on the structural parameters 806 of the photonic integrated circuit. Additionally, each update operation at a particular time step may also be based, at least in part, on a previous (e.g., immediately prior) time step.
Consequently, the operational simulation 802 simulates an interaction between the photonic device (i.e., the photonic integrated circuit) and physical stimuli (i.e., one or more excitation sources) to determine a simulated output of the photonic device (e.g., at one or more of the output ports or regions) in response to the physical stimuli. The interaction may correspond to any one of, or a combination of, a perturbation, retransmission, attenuation, dispersion, refraction, reflection, diffraction, absorption, scattering, amplification, or otherwise of the physical stimuli within the electromagnetic domain due, at least in part, to the structural parameters 806 of the photonic device and underlying physics governing operation of the photonic device. Thus, the operational simulation 802 simulates how the field response of the simulated environment 606 changes due to the excitation source over a plurality of time steps (e.g., from an initial to final time step with a pre-determined step size).
In some embodiments, the simulated output may be utilized to determine one or more performance metrics of the physical device. For example, the excitation source may correspond to a selected one of a plurality of distinct wavelength channels that are each mapped to one of the plurality of output ports. The excitation source may originate at or be disposed proximate to the first communication region (i.e., input port) when performing the operational simulation 802. During the operational simulation 802, the field response at the output port mapped to the selected one of the plurality of distinct wavelength channels may then be utilized to determine a simulated power transmission of the photonic integrated circuit for the selected distinct wavelength channel. In other words, the operational simulation 802 may be utilized to determine the performance metric that includes determining a simulated power transmission of the excitation source from the first communication region, through the design region, and to a respective one of the plurality of second communication regions mapped to the selected one of the plurality of distinct wavelength channels. In some embodiments, the excitation source may cover the spectrum of all of the plurality of output ports (e.g., the excitation source spans at least the targeted frequency ranges for the bandpass regions for each of the plurality of distinct wavelength channels as well as the corresponding transition band regions, and at least portions of the corresponding stopband regions) to determine a performance metric (i.e., simulated power transmission) associated with each of the distinct wavelength channels for the photonic integrated circuit. 
In some embodiments, one or more frequencies that span the passband of a given one of the plurality of distinct wavelength channels are selected randomly to optimize the design (e.g., batch gradient descent while having a full width of each passband including ripple in the passband that meets the target specifications). In the same or other embodiments, each of the plurality of distinct wavelength channels has a common bandwidth with different center wavelengths. The performance metric may then be used to generate a performance loss value for the initial design 834. The performance loss value may correspond to a difference between the performance metric and a target performance metric of the physical device.
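One possible way to turn per-channel power transmission into a performance loss value is sketched below. The channel and port names, the transmission numbers, and the use of a summed squared difference are all assumptions for illustration; the disclosure only requires that the loss correspond to a difference between the performance metric and its target.

```python
def performance_loss(transmission, channel_to_port, target=1.0):
    """Sum over channels of the squared gap between simulated and target
    power transmission at the output port mapped to each channel."""
    total = 0.0
    for channel, port in channel_to_port.items():
        simulated = transmission[channel][port]
        total += (target - simulated) ** 2
    return total

# Hypothetical mapping of two wavelength channels to two output ports.
channel_to_port = {"ch1": "out1", "ch2": "out2"}

# Hypothetical simulated power transmission for each channel at each port.
transmission = {
    "ch1": {"out1": 0.5, "out2": 0.25},
    "ch2": {"out1": 0.0, "out2": 0.75},
}
print(performance_loss(transmission, channel_to_port))
```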
In some embodiments, the loss metric 822 may include terms in addition to the performance loss value in order to optimize different aspects of the initial design 834. For example, in some embodiments, a term for a fabrication loss value may be included in the loss metric 822. One advantage of the use of geometric shape primitives is the particular ease with which compliance with fabrication constraints can be determined and included within the loss metric 822.
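The ease of checking fabrication constraints on geometric shape primitives may be illustrated with circles, for which a minimum feature size and a minimum distance reduce to simple arithmetic on radii and center positions. The quadratic penalty, the treatment of any gap smaller than the minimum distance as a violation (including overlapping circles), the weighting factor, and the function names are assumptions made for this sketch.

```python
import math

def fabrication_loss(circles, min_radius, min_gap):
    """Penalize circles (x, y, r) that are too small or too close together."""
    total = 0.0
    for i, (x1, y1, r1) in enumerate(circles):
        # Minimum feature size: penalize radii below min_radius.
        total += max(0.0, min_radius - r1) ** 2
        for (x2, y2, r2) in circles[i + 1:]:
            # Minimum distance: edge-to-edge gap between the two circles.
            gap = math.hypot(x2 - x1, y2 - y1) - r1 - r2
            total += max(0.0, min_gap - gap) ** 2
    return total

def loss_metric(performance_loss_value, fabrication_loss_value, weight=1.0):
    """Combine the two loss values into a single scalar loss metric."""
    return performance_loss_value + weight * fabrication_loss_value

circles = [(0.0, 0.0, 1.0), (2.5, 0.0, 1.0)]   # gap between edges is 0.5
fab = fabrication_loss(circles, min_radius=0.5, min_gap=1.0)
print(loss_metric(0.3125, fab))
```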
Returning to
Block 928 illustrates backpropagating the loss metric 822 through the simulated environment 606 to determine a feature gradient 852 similarly to the determination of the structural gradient. As discussed above, the feature presence function 850 is typically of a form through which gradients will not naturally flow during this process. Accordingly, a different function may be used for the feature gradient 852 than was used for the feature presence function 850.
wherein DZS is a “dead zone slope” of the function within the dead zone 1108, and c is the location of the feature presence cutoff 1106. The value of ymin represents the y-value of the function at the point of cutoff, and the value of ymax represents the y-value of the function at x=1, such that these two values define the slope of the function in the linear region. The value of b controls the width of the smooth transition region between the dead zone slope and the linear regions. Larger values of b make the transition between the dead zone 1108 and the linear regions sharper, and smaller values of b make the transition between the dead zone 1108 and the linear regions more gradual. In the illustrated embodiment, suitable values for these variables are c=0.3, ymin=3, ymax=10, and DZS=0.5. In some embodiments, each of these variables may be adjusted or specified by the design specification. In some embodiments, each of these variables may be tuned as a hyperparameter as part of the optimization. One will note that for values of x near c (i.e., within the dead zone 1108), the function does not precisely match the value of y in the feature presence function 850, but the feature gradient function 1104 does allow the feature gradient 852 to be computed.
Returning to
via:
with
being obtainable from the analytic function defining the values for the signed distance field provided above. Similarly, the backpropagation may be applied concurrently (or sequentially) using the feature gradient 852 to the values that are provided as input to the feature presence function 850 to further adjust the loss metric.
In some embodiments, adjusting for the loss metric may reduce the loss metric. However, in other embodiments, the loss metric may be adjusted or otherwise compensated in a manner that does not necessarily reduce the loss metric. In some embodiments, the revised description is generated by utilizing an optimization scheme after a cycle of operational and adjoint simulations via a gradient descent algorithm, Markov Chain Monte Carlo algorithm, or other optimization techniques. Put in another way, iterative cycles of simulating the physical device, determining a loss metric, backpropagating the loss metric, updating the structural parameters to adjust the loss metric, and updating the geometric shape primitives using the signed distance fields may be successively and iteratively performed until the loss metric substantially converges such that the difference between the performance metric and the target performance metric is within a threshold range. In some embodiments, the term “converges” may simply indicate the difference is within the threshold range and/or below some threshold value.
At decision block 932, a determination is made regarding whether optimization of the design of the physical device is done. In some embodiments, optimization of the design of the physical device may be done when it is determined that the loss metric 822 has reached an acceptable value, such as a value specified by the design specification. In some embodiments, optimization of the design of the physical device may be done after a predetermined number of iterations.
If the determination is that optimization is not yet done, then the result of decision block 932 is NO, and the method 900 returns to block 912 to iterate on the updated features. Otherwise, if the determination is that optimization is done, then the result of decision block 932 is YES and the method 900 advances to block 934.
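The two stopping conditions described for decision block 932 (an acceptable loss value or a predetermined number of iterations) may be sketched with a plain gradient descent loop. The quadratic stand-in for the simulation, the step size, and the function names are hypothetical; in practice the loss and gradients would come from the operational and adjoint simulations.

```python
def optimize(params, loss_and_grad, step_size=0.1,
             acceptable_loss=1e-6, max_iterations=1000):
    """Gradient descent that stops at an acceptable loss or an iteration cap."""
    for iteration in range(max_iterations):
        loss, grad = loss_and_grad(params)
        if loss <= acceptable_loss:          # decision: optimization is done
            return params, loss, iteration
        params = [p - step_size * g for p, g in zip(params, grad)]
    loss, _ = loss_and_grad(params)          # iteration cap reached
    return params, loss, max_iterations

def toy_loss_and_grad(params, target=(1.0, -2.0)):
    """Stand-in for simulation plus backpropagation: a quadratic bowl."""
    loss = sum((p - t) ** 2 for p, t in zip(params, target))
    grad = [2.0 * (p - t) for p, t in zip(params, target)]
    return loss, grad

final_params, final_loss, iterations = optimize([0.0, 0.0], toy_loss_and_grad)
print(final_params, final_loss, iterations)
```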
Block 934 illustrates outputting the updated design of the physical device. The updated design may be output to a computer-readable medium for storage and later operations, including but not limited to fabrication, further optimization, or inclusion in additional designs. In some embodiments, the updated design may be output to a fabrication system for fabrication of the physical device. In some embodiments, the updated design may be output to the fabrication system by providing a grid of voxels that each indicate a material to be included at a corresponding position of the physical device. In some embodiments, the updated design may be output to the fabrication system by outputting the list of geometric shape primitives itself, which may then be ingested by the fabrication system for fabricating the physical device.
The method 900 then proceeds to an end block and terminates.
By using a differentiable function to generate the feature gradient 852, gradient optimization can be used to both add features to and remove features from the structural parameters 806. Since the slope of the feature gradient 852 within the dead zone 1108 is defined, and a non-zero slope is provided for values below the dead zone 1108, gradient optimization can allow values to move in either direction past the feature presence cutoff 1106 during iterations of the optimization loop.
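As a non-limiting illustration of why a defined, non-zero slope below the cutoff matters, consider the following sketch. It does not reproduce the feature presence function or the dead zone 1108 of the figures; it assumes a simple piecewise-linear ("leaky") function, and the names `presence_slope`, `is_present`, `step`, `cutoff`, and `leak` are hypothetical.

```python
def presence_slope(p, cutoff=0.5, leak=0.1):
    """Slope of an assumed piecewise-linear presence function at value p."""
    # Full slope at or above the cutoff, a small slope `leak` below it.
    return 1.0 if p >= cutoff else leak


def is_present(p, cutoff=0.5):
    """A feature is treated as present when its value reaches the cutoff."""
    return p >= cutoff


def step(p, dloss_dweight, step_size=0.5, cutoff=0.5, leak=0.1):
    """One gradient descent step on p, using the chain rule through the slope."""
    return p - step_size * dloss_dweight * presence_slope(p, cutoff, leak)


# Case 1: an absent feature that the loss favors adding (negative gradient).
added = 0.3
for _ in range(10):
    added = step(added, dloss_dweight=-1.0)

# Case 2: a present feature that the loss favors removing (positive gradient).
removed = step(0.7, dloss_dweight=1.0)

# Case 3: with a zero slope below the cutoff, an absent feature cannot return.
stuck = 0.3
for _ in range(10):
    stuck = step(stuck, dloss_dweight=-1.0, leak=0.0)
```

Under these assumptions, the first value climbs past the cutoff, the second drops below it, and the third never moves because its gradient vanishes, which is the situation a non-zero slope below the cutoff avoids.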
In the preceding description, numerous specific details are set forth to provide a thorough understanding of various embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the techniques described herein can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring certain aspects.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
The order in which some or all of the blocks appear in each method flowchart should not be deemed limiting. Rather, one of ordinary skill in the art having the benefit of the present disclosure will understand that actions associated with some of the blocks may be executed in a variety of orders not illustrated, or even in parallel.
The processes explained above are described in terms of computer software and hardware. The techniques described may constitute machine-executable instructions embodied within a tangible or non-transitory machine (e.g., computer) readable storage medium, that when executed by a machine will cause the machine to perform the operations described. Additionally, the processes may be embodied within hardware, such as an application specific integrated circuit (“ASIC”) or otherwise.
The above description of illustrated embodiments of the invention, including what is described in the Abstract, is not intended to be exhaustive or to limit the invention to the precise forms disclosed. While specific embodiments of, and examples for, the invention are described herein for illustrative purposes, various modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize.
These modifications can be made to the invention in light of the above detailed description. The terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification. Rather, the scope of the invention is to be determined entirely by the following claims, which are to be construed in accordance with established doctrines of claim interpretation.