All of the prospective target domains are put through
a battery of computational tools; those proteins predicted
to be membranous, unstructured or otherwise unsuitable
are immediately removed from the pool as being
intractable. Next, database searches are used, and proteins
that can be computationally modelled by homology to
known structures are also set aside. The remaining candidates
are all valid ‘structural genomics proteins,’ as they
are thought to be tractable, and their experimental characterization
will provide structural information that
could not have been predicted. Priority is assigned to
families of structural genomics proteins according to
their desirable characteristics62, such as phylogenetic distribution63,64,
family size46, likelihood of producing a new
fold65,66 and functional relevance67.
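
The selection procedure described above can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions, not the pipeline used by any particular structural genomics centre: the `Candidate` fields, the function names and the per-family scores are all hypothetical, and the predictions themselves (membrane, disorder, homology modelling) are assumed to have been computed beforehand by external tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A prospective target domain with precomputed predictions (hypothetical fields)."""
    name: str
    family: str
    predicted_membrane: bool    # predicted to be a membrane protein
    predicted_disordered: bool  # predicted to be unstructured
    homology_modelable: bool    # can be modelled from a known structure


def select_targets(candidates):
    """Apply the two filters in order: tractability first, then novelty."""
    tractable = [
        c for c in candidates
        if not (c.predicted_membrane or c.predicted_disordered)
    ]
    # Proteins that can already be modelled by homology are set aside.
    return [c for c in tractable if not c.homology_modelable]


def rank_families(targets, family_scores):
    """Order the families of the surviving targets by descending score.

    family_scores maps a family name to a priority score. How that score
    combines phylogenetic distribution, family size, new-fold likelihood
    and functional relevance is an assumption left to the caller.
    Families without a score default to 0; ties are broken by name.
    """
    families = {c.family for c in targets}
    return sorted(families, key=lambda f: (-family_scores.get(f, 0.0), f))


# Toy pool: d2 is removed as intractable, d3 as homology-modelable.
pool = [
    Candidate("d1", "famA", False, False, False),
    Candidate("d2", "famA", True, False, False),
    Candidate("d3", "famB", False, False, True),
    Candidate("d4", "famC", False, False, False),
]
targets = select_targets(pool)
ranking = rank_families(targets, {"famA": 0.4, "famC": 0.9})
```

Keeping the two filters separate from the ranking mirrors the order given in the text: candidates are first excluded outright, and only the families of the remaining proteins are prioritized.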