We chose Golden Gate Assembly because it let us build libraries from multiple parts in a single cloning step. Golden Gate Assembly relies on type IIs restriction enzymes, which have several key features:
- They cut distal to their recognition sites. This gives us full control over the resulting 4 bp overhangs and a theoretical choice among 4^4 = 256 different overhangs. In practice, we reduced this set by eliminating overhangs in which more than two of the four bases match another overhang in the set, to minimize the chance of mis-annealing.
- Their recognition sites are not palindromic, and thus have directionality. Depending on the orientation of the site, cutting happens either 5' or 3' of it. The direction can be switched by reverse-complementing the site sequence.
- They can cut themselves out. Depending on the direction of cutting, the recognition site can be removed from the desired fragment, so the fragment is not digested again after it has been ligated.
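The overhang reduction described in the first point can be sketched in a few lines of Python. This is a minimal illustration, not the procedure we actually used: the greedy selection order, the check against reverse complements, and the exclusion of self-complementary overhangs are all assumptions added for the example, and the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_bases(a, b):
    """Count the positions at which two 4 bp overhangs carry the same base."""
    return sum(x == y for x, y in zip(a, b))


def pick_overhangs(max_shared=2):
    """Greedily keep overhangs that share at most `max_shared` bases
    with every overhang already kept (and with its reverse complement)."""
    all_overhangs = ["".join(p) for p in product("ACGT", repeat=4)]
    kept = []
    for cand in all_overhangs:
        if cand == reverse_complement(cand):
            continue  # assumption: skip self-complementary overhangs
        if all(
            shared_bases(cand, k) <= max_shared
            and shared_bases(cand, reverse_complement(k)) <= max_shared
            for k in kept
        ):
            kept.append(cand)
    return all_overhangs, kept


all_overhangs, kept = pick_overhangs()
print(len(all_overhangs))  # 4**4 = 256 candidates
print(len(kept), kept[:5])
```

A greedy pass like this depends on the order in which candidates are visited, so it yields one valid reduced set rather than the largest possible one.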
An example type IIs restriction enzyme, BsmBI. In this diagram, a GACC overhang is created.
In classic Golden Gate Assembly, a type IIs enzyme generates these overhangs and T4 DNA Ligase joins them. We began with various part plasmids and a backbone all in a single pot. These are assembled in a single reaction that oscillates between digestion and ligation to move towards the correct product. Because type IIs endonucleases cut distal to their recognition site, once a fragment is ligated to its correct neighbor, the product is fixed because the recognition site no longer exists in the ligated result.
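To make the "fixed product" argument concrete, here is a small Python sketch of digestion and ligation. It is a simplification under stated assumptions: only the top strand is tracked, BsmBI is modeled as cutting one base past its CGTCTC site to leave a 4-base overhang, and the part sequences and function names are invented for illustration. Only the GACC overhang comes from the BsmBI example above.

```python
import re

SITE = "CGTCTC"     # BsmBI recognition site as read on the top strand
SITE_RC = "GAGACG"  # the same site when it sits on the bottom strand


def release_insert(seq):
    """Return the fragment lying between a forward and a reverse BsmBI site.

    The result is top-strand text that includes both 4-base overhangs;
    the recognition sites and their 1-base spacers are cut away.
    """
    pattern = SITE + r"[ACGT]([ACGT]{4}[ACGT]*?[ACGT]{4})[ACGT]" + SITE_RC
    match = re.search(pattern, seq)
    if match is None:
        raise ValueError("no fragment flanked by inward-facing BsmBI sites")
    return match.group(1)


def ligate(left, right):
    """Join two fragments whose adjacent 4-base overhangs are identical."""
    if left[-4:] != right[:4]:
        raise ValueError("overhangs do not match")
    return left + right[4:]


# Hypothetical parts: site + spacer + overhang + insert + overhang + spacer + site
part_a = "TT" + SITE + "A" + "GACC" + "ATGAAA" + "TTCT" + "T" + SITE_RC + "AA"
part_b = SITE + "T" + "TTCT" + "GGTGGT" + "AGGA" + "C" + SITE_RC

product_seq = ligate(release_insert(part_a), release_insert(part_b))
print(product_seq)
print(SITE in product_seq or SITE_RC in product_seq)  # False: nothing left to cut
```

Because the joined product contains neither orientation of the recognition site, another digestion step leaves it untouched, while any re-ligated starting plasmid is simply cut again.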
In our MiCode Assembly (MCA), we utilized two type IIs enzymes, BsaI and BsmBI. By alternating between these two enzymes in each round of assembly and reintroducing the sites when necessary, we can iteratively build up larger constructs. We designed MCA to emphasize interchangeability, which allows the part set to be expanded iteratively and lets us build MiCodes efficiently.
The fundamental unit of our GGA is the part plasmid, shown below. It is analogous to the part plasmids kept in the registry, but instead of being flanked by EcoRI, XbaI, SpeI, and PstI sites, it is flanked by outer BsaI sites. These BsaI sites create unique overhangs when digested, which anneal with other complementary overhangs in a single-pot reaction. Conveniently, BsaI and BsmBI are compatible with T4 DNA Ligase, so enzyme and ligase can be combined in a single-pot reaction that cycles between optimal temperatures for digestion (37 °C) and ligation (16 °C). Because correct products are fixed, the system converges towards our final product. We use plasmids because they allow us to easily amplify and sequence the DNA.
Each part is flanked by unique overhangs that dictate its potential neighbors, its position in the cassette, and the type of part it is (detailed below).
A part plasmid coding for PAmCherry, our photoactivatable red fluorescent protein.
Several part plasmids can be joined via their overhangs to produce a cassette. We break up each cassette into several parts, each of which has overhangs unique to that type of part. These overhangs make sure the cassette assembles correctly, with each part in its correct position. For MCA, a cassette consists of the backbone with origin and marker, 5' upstream region, promoter, part, terminator, and 3' downstream region.
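The way overhangs dictate position can be checked with a short Python sketch that follows matching overhangs around the plasmid. The six overhang sequences and the part names below are hypothetical placeholders, not the ones used in MCA, and `assembly_order` is a name made up for this example.

```python
def assembly_order(parts, start):
    """Follow matching overhangs from `start` until the circle closes.

    `parts` maps a part name to its (left overhang, right overhang).
    Raises ValueError if the parts cannot form one complete circle.
    """
    by_left = {}
    for name, (left, _right) in parts.items():
        if left in by_left:
            raise ValueError(f"overhang {left} is used by two parts")
        by_left[left] = name

    order = [start]
    overhang = parts[start][1]
    while True:
        nxt = by_left.get(overhang)
        if nxt is None:
            raise ValueError(f"no part accepts overhang {overhang}")
        if nxt == start:
            break  # back at the backbone: the plasmid is closed
        if nxt in order:
            raise ValueError("parts loop without returning to the start")
        order.append(nxt)
        overhang = parts[nxt][1]

    if len(order) != len(parts):
        raise ValueError("circle closed before every part was used")
    return order


# Hypothetical overhangs, chosen only so that each junction is unique.
parts = {
    "backbone": ("TGCT", "AACC"),
    "upstream_5p": ("AACC", "ACGA"),
    "promoter": ("ACGA", "CTTG"),
    "part": ("CTTG", "GGTA"),
    "terminator": ("GGTA", "TCAG"),
    "downstream_3p": ("TCAG", "TGCT"),
}
print(assembly_order(parts, "backbone"))
```

Swapping in another promoter only requires that it carries the same two overhangs as the one it replaces, which is what makes the parts at each position interchangeable.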
For each position in the cassette, there are multiple parts with compatible overhangs (each position has multiplicity), which allows us to easily interchange parts from the library of part plasmids we and the Dueber Lab have made. In this cassette assembly step, all the components of this cassette are physically linked together. The physical linkage retains the data about this unit of the MiCode in such a way that we can utilize it downstream. This allows us to treat the entire cassette as a single unit and enables us to build larger constructs using the cassettes with the confidence that the genotype remains stable.
A cassette coding for nucleus-localized PAmCherry with ConS and Con2 connector regions.
In the next round of assembly, we combine several cassettes together to form multigene cassettes. Because the multigene cassettes are built from the cassettes, the information each cassette carries is retained in downstream usage. And because we build each multigene cassette by hand, we know exactly what its genotype and expected phenotype are. At each position between connector regions, there are multiple cassettes we can substitute in to create our desired MiCode. This modularity enables us to build a large combinatorial set of MiCodes with only a few cassettes.
A multigene cassette linking various MiCode components to a leucine zipper. This half codes for the bait zipper.
MiCode components linked to a prey zipper.
Up until this point, we have been building one construct per reaction because we wanted to know precisely what we were building. After we build a library of bait zippers and a separate library of prey zippers by hand, we can combine the two in a large single-pot reaction. Because we designed the bait half-MiCode to be compatible only with the prey half-MiCode, we can ensure that the assembly links exactly one prey to one bait. Additionally, because all components of each half-MiCode were physically linked on the plasmid in the previous round of cloning, we can link the phenotype expressed by the fluorescence to the leucine zipper.
The full MiCode assembly, done in a single-pot reaction to combine the 40 bait and 40 prey zippers into a 1,600-member library.
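The library size follows directly from pairing every bait half with every prey half. As a quick sketch in Python (the `bait_NN` and `prey_NN` labels are placeholders, not our actual construct names):

```python
from itertools import product

# Placeholder labels for the 40 hand-built bait and 40 prey half-MiCodes.
baits = [f"bait_{i:02d}" for i in range(1, 41)]
preys = [f"prey_{i:02d}" for i in range(1, 41)]

# Each bait half is only compatible with a prey half, so every library
# member is exactly one (bait, prey) pair.
library = list(product(baits, preys))

print(len(library))  # 40 * 40 = 1600
print(library[0])    # ('bait_01', 'prey_01')
```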